Saier, Tarek
Färber, Michael
Tsereteli, Tornike
2023-04-14
2023-04-14
2021
1432-5012
1432-1300
1843484552
http://nbn-resolving.de/urn:nbn:de:bsz:93-opus-ds-129653
http://elib.uni-stuttgart.de/handle/11682/12965
http://dx.doi.org/10.18419/opus-12946
Citation information in scholarly data is an important source of insight into the reception of publications and the scholarly discourse. Outcomes of citation analyses and the applicability of citation-based machine learning approaches heavily depend on the completeness of such data. One particular shortcoming of scholarly data nowadays is that non-English publications are often not included in data sets, or that language metadata is not available. Because of this, citations between publications of differing languages (cross-lingual citations) have only been studied to a very limited degree. In this paper, we present an analysis of cross-lingual citations based on over one million English papers, spanning three scientific disciplines and a time span of three decades. Our investigation covers differences between cited languages and disciplines, trends over time, and the usage characteristics as well as impact of cross-lingual citations. Among our findings are an increasing rate of citations to publications written in Chinese, citations being primarily to local non-English languages, and consistency in citation intent between cross- and monolingual citations. To facilitate further research, we make our collected data and source code publicly available.
en
info:eu-repo/semantics/openAccess
https://creativecommons.org/licenses/by/4.0/
004
400
Cross-lingual citations in English papers : a large-scale analysis of prevalence, usage, and impact
article
2023-03-24