Humanistic interpretation and machine learning


Juho Pääkkönen1,2 · Petri Ylikoski1,3

Received: 31 October 2019 / Accepted: 22 July 2020 © The Author(s) 2020

Abstract

This paper investigates how unsupervised machine learning methods might make hermeneutic interpretive text analysis more objective in the social sciences. Through a close examination of the uses of topic modeling, a popular unsupervised approach in the social sciences, it argues that the primary way in which unsupervised learning supports interpretation is by allowing interpreters to discover unanticipated information in larger and more diverse corpora and by improving the transparency of the interpretive process. This view highlights that unsupervised modeling does not eliminate the researchers’ judgments from the process of producing evidence for social scientific theories. The paper shows this by distinguishing between two prevalent attitudes toward topic modeling, i.e., topic realism and topic instrumentalism. Under neither can modeling provide social scientific evidence without the researchers’ interpretive engagement with the original text materials. Thus, unsupervised text analysis cannot improve the objectivity of interpretation by alleviating the problem of underdetermination in interpretive debate. The paper argues that the sense in which unsupervised methods can improve objectivity is by providing researchers with the resources to justify to others that their interpretations are correct. This kind of objectivity seeks to reduce suspicions in collective debate that interpretations are the products of arbitrary processes influenced by the researchers’ idiosyncratic decisions or starting points. The paper discusses this view in relation to alternative approaches to formalizing interpretation and identifies several limitations on what unsupervised learning can be expected to achieve in terms of supporting interpretive work.

Keywords Humanistic interpretation · Topic modeling · Machine learning · Objectivity · Text analytics · Latent Dirichlet allocation

Juho Pääkkönen [email protected]

1 Sociology, University of Helsinki, Helsinki, Finland
2 Computer Science, Aalto University, Espoo, Finland
3 Institute for Analytical Sociology, Linköping University, Norrköping, Sweden

Synthese

1 Introduction

The objectivity of interpretive text analysis (humanistic interpretation1) has been a hot potato in the social sciences since their beginning. The necessity of humanistic interpretation has been generally recognized, but many have retained their suspicions about the sources of bias that could influence the interpretive process. Thus attempts have been made to formalize the interpretive process to make it more transparent and to control some possible biases. These attempts have met with opposition. In particular, formal approaches based on coding have been argued to be limited in terms of replicability and in their ability to account for nuances in textual meaning. At worst, coding procedures have been argued to impose interpret