
Machine learning misclassification of academic publications reveals non-trivial interdependencies of scientific disciplines

Alexey Lyutov¹ · Yilmaz Uygun¹ · Marc-Thorsten Hütt²

Received: 7 April 2020 / Accepted: 10 November 2020
© The Author(s) 2020

Abstract

Exploring the production of knowledge with quantitative methods is the foundation of scientometrics. In an application of machine learning to scientometrics, we consider the problem of classifying academic publications into the subcategories of a multidisciplinary journal (and hence into scientific disciplines) based on the information contained in the abstract. In contrast to standard classification tasks, we are not interested in maximizing accuracy; rather, we ask whether the failures of an automatic classification are systematic and contain information about the system under investigation. These failures can be represented as a ‘misclassification network’ interrelating scientific disciplines. Here we show that this misclassification network (1) gives a markedly different pattern of interdependencies among scientific disciplines than common ‘maps of science’, (2) reveals a statistical association between misclassification and citation frequencies, and (3) allows disciplines to be classified as ‘method lenders’ and ‘content explorers’, based on their in-degree/out-degree asymmetry. On a more general level, in a wide range of machine learning applications, misclassification networks have the potential to extract systemic information from failed classifications, making it possible to visualize and quantitatively assess those aspects of a complex system that are not machine learnable.

Keywords  Machine learning · Scientometrics · Maps of science · Classification algorithms · Interdisciplinary research
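As an illustration of the idea summarized in the abstract, the following minimal sketch shows one way a misclassification network and its in-degree/out-degree asymmetry could be computed from a classifier's output. It is not the authors' implementation. The function names, the toy labels, and the edge direction (from the true category to the predicted category) are assumptions made for illustration only, and the sketch makes no claim about which sign of the asymmetry corresponds to ‘method lenders’ or ‘content explorers’.

```python
from collections import Counter, defaultdict


def misclassification_network(true_labels, predicted_labels):
    """Count directed edges (true -> predicted) over misclassified items only.

    Assumption: an edge points from the true category to the category the
    classifier predicted instead. Correct predictions add no edge.
    """
    edges = Counter()
    for true, predicted in zip(true_labels, predicted_labels):
        if true != predicted:
            edges[(true, predicted)] += 1
    return edges


def degree_asymmetry(edges):
    """Return {category: (in_weight, out_weight, in_weight - out_weight)}."""
    in_weight = defaultdict(int)
    out_weight = defaultdict(int)
    for (source, target), weight in edges.items():
        out_weight[source] += weight
        in_weight[target] += weight
    categories = set(in_weight) | set(out_weight)
    return {
        c: (in_weight[c], out_weight[c], in_weight[c] - out_weight[c])
        for c in categories
    }


# Hypothetical toy data: six abstracts with true and predicted categories.
true_labels = ["physics", "biology", "biology", "chemistry", "chemistry", "physics"]
predicted_labels = ["physics", "physics", "chemistry", "physics", "chemistry", "physics"]

edges = misclassification_network(true_labels, predicted_labels)
for category, (w_in, w_out, diff) in sorted(degree_asymmetry(edges).items()):
    print(f"{category}: in={w_in} out={w_out} asymmetry={diff}")
```

In this toy example, a category with a large positive asymmetry attracts more misclassified abstracts from other categories than it loses to them. How such asymmetries are interpreted depends on the edge convention, which is assumed here rather than taken from the paper.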

Electronic supplementary material  The online version of this article (https://doi.org/10.1007/s11192-020-03789-8) contains supplementary material, which is available to authorized users.

* Alexey Lyutov, a.lyutov@jacobs-university.de

¹ Department of Mathematics and Logistics, Jacobs University, Campus Ring 1, 28759 Bremen, Germany

² Department of Life Sciences and Chemistry, Jacobs University, Campus Ring 1, 28759 Bremen, Germany




Scientometrics

Introduction

The rich body of research exploring how to construct ‘maps of science’, which aim at a locally and globally accurate representation of the relationships, distances, and proximities of scientific disciplines (the ‘scientific landscape’), is one of the cornerstones of scientometrics. This field of research provides quantitative analyses of the mechanisms, prerequisites, and predictors of academic success, and creates meaningful representations of the interdependencies among scientific disciplines as a basis for strategic decisions (Boyack et al. 2005; Leydesdorff 2001). Starting from the first networks of scientific publications (Price 1965), which can be seen as the initiation of scientometrics, and the diverse approaches