Constructing Context-Aware Sentiment Lexicons with an Asynchronous Game with a Purpose

Abstract. One of the reasons sentiment lexicons do not reach human-level performance is that they lack the contexts that define the polarities of words. While obtaining this knowledge through machine learning would require huge amounts of data, context is commonsense knowledge for people, so human computation is a better choice. We identify context using a game with a purpose that increases the workers’ engagement in this complex task. With the contextual knowledge we obtain from only a small set of answers, we already halve the sentiment lexicons’ performance gap relative to human performance.

A. Gelbukh (Ed.): CICLing 2014, Part II, LNCS 8404, pp. 32–44, 2014. © Springer-Verlag Berlin Heidelberg 2014

1 Introduction

Sentiment analysis identifies expressions of subjectivity in texts, such as sentiments or emotional states. We consider the sentiment classification task, which determines whether the sentiments expressed in a text are positive or negative. This task requires commonsense knowledge about the polarities of sentiment words.

The relative ease of construction led early researchers in the field toward corpus-based sentiment classification [1–3]. These methods aggregate statistical, syntactic, and semantic relations between words. A significant downside is that the resulting classifiers are efficient only on narrow domains. This may be the reason why the competing, lexicon-based approach is currently the backbone of sentiment classification. Several sentiment lexicons [4–6] have been available for a significant period of time. However, new lexicons continue to appear [7, 8], showing that a satisfying solution has not yet been found.

The most successful methods perform syntactic preprocessing to extract relevant words, and then consider the resulting set of independent words as features of the text. Sentiment classification is performed on these features by adding word polarity scores compiled in sentiment lexicons or learned with statistical methods. These models obtain from 60% to 80% accuracy [2, 1]. Better results can sometimes be achieved by training domain-specific classifiers, but only at the expense of narrow coverage. This performance is lower than that of people, who can extract sentiment with 80% to 90% agreement [9], depending on the domain of the texts.

A reason why these classifiers cannot reach human-level performance is that the words’ polarities are influenced by context: a small hotel room is negative, while a small digital camera is positive. By representing texts as independent words, context is ignored. In narrow domains, words mostly occur in a single context, thus high accuracy can be achieved. For broad domains, it is necessary to enrich the feature set with contexts, by including word combinations. However, the complexity of the resulting models would explode, and it would no longer be feasible to acquire them from data. Nevertheless, the polarity of most words has only a few exceptions, so the size of these models could be ma
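The contrast between context-free and context-aware scoring described above can be illustrated with a minimal sketch. It is not the authors' implementation: the toy lexicon entries, the polarity values, and the names `BASE_POLARITY`, `CONTEXT_POLARITY`, `score_pairs`, and `classify` are all assumptions made for illustration. The sketch sums word polarity scores, as lexicon-based classifiers do, and optionally overrides a word's default polarity when a (word, context) exception is listed.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical toy lexicon: default (context-free) polarity of each word.
BASE_POLARITY: Dict[str, float] = {"small": -1.0, "clean": 1.0, "noisy": -1.0}

# Hypothetical context exceptions: (word, context) pairs whose polarity
# differs from the word's default, e.g. "small" is positive for a camera.
CONTEXT_POLARITY: Dict[Tuple[str, str], float] = {("small", "camera"): 1.0}


def score_pairs(pairs: Iterable[Tuple[str, str]], use_context: bool = True) -> float:
    """Sum the polarities of (word, context) pairs extracted from a text."""
    total = 0.0
    for word, context in pairs:
        polarity = BASE_POLARITY.get(word, 0.0)  # unknown words count as neutral
        if use_context:
            # Fall back to the default polarity when no exception is listed.
            polarity = CONTEXT_POLARITY.get((word, context), polarity)
        total += polarity
    return total


def classify(pairs: Iterable[Tuple[str, str]], use_context: bool = True) -> str:
    """Map the summed score to a label; a zero score is reported as neutral."""
    score = score_pairs(pairs, use_context)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    print(classify([("small", "room")]))                       # hotel room example
    print(classify([("small", "camera")]))                     # digital camera example
    print(classify([("small", "camera")], use_context=False))  # context ignored
```

Because only the exceptions are stored alongside the defaults, the table of contexts stays small when most words have few exceptions, which is the property the paragraph above relies on.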
