MEmoFC: introducing the Multilingual Emotional Football Corpus


Nadine Braun¹ • Chris van der Lee¹ • Lorenzo Gatti² • Martijn Goudbeek¹ • Emiel Krahmer¹

Accepted: 24 September 2020 © The Author(s) 2020

Abstract This paper introduces a new corpus of paired football match reports, the Multilingual Emotional Football Corpus (MEmoFC), which has been manually collected from English, German, and Dutch websites of individual football clubs to investigate the way different emotional states (e.g. happiness for winning and disappointment for losing) are realized in written language. In addition to the reports, it also contains the statistics for the selected matches. MEmoFC consists of comparable subcorpora, since the authors of the texts report on the same event from two different perspectives, the winner's and the loser's side, and from an arguably more neutral perspective in tied matches. We demonstrate how the corpus can be used to investigate the influence of affect on the reports through different approaches and illustrate how game outcome influences (1) references to the own team and the opponent, and (2) the use of positive and negative emotion terms in the different languages. The MEmoFC corpus, together with the analyzed aspects of emotional language, will open up new approaches for targeted automatic generation of texts.

Keywords Affect · Emotion · Multilingual corpus · Comparable corpora · Natural language generation · Sports · Reportage

✉ Nadine Braun [email protected]
Chris van der Lee [email protected]
Lorenzo Gatti [email protected]
Martijn Goudbeek [email protected]
Emiel Krahmer [email protected]

1 Tilburg Center for Cognition and Communication (TiCC), Tilburg University, 5037 AB Tilburg, The Netherlands

2 Human Media Interaction Lab, University of Twente, Zilverling 2082, 7500 AE Enschede, The Netherlands

1 Introduction

This paper introduces the Multilingual Emotional Football Corpus (MEmoFC),¹ a new corpus consisting of pairs of football reports, which can be used for the study of affective language. We present the text corpus in three languages, English, Dutch, and German, combined with the matching football game statistics, as a resource for investigating how (affective) perspective can change reporting about an event. To the best of our knowledge, this multilingual corpus is the first one where objective data and textual realizations from multiple affective perspectives are systematically combined.

Sports reportage provided by sports clubs themselves is arguably one of the most interesting registers available for linguistic analyses of affect-laden language from different perspectives. It opens up room for creative language, starting already with the headlines of the match reports (Smith and Montgomery 1989). Additionally, the point of view of the author of a match report is clearly definable from the beginning, as it is either a reaction to a tie (that might still be perceived as a net loss or win by the team) or, depending on the perspective, a loss or a win