An enhanced sentiment dictionary for domain adaptation with multi-domain dataset in Tamil language (ESD-DA)


METHODOLOGIES AND APPLICATION

E. Sivasankar¹ · K. Krishnakumari² · P. Balasubramanian¹

© Springer-Verlag GmbH Germany, part of Springer Nature 2020

Abstract

Sentiment analysis mostly relies on dictionary-based approaches to recognize the polarity of terms in a review. However, in domain adaptation (DA), where sentiment analysis is carried across different domains, the sentiment lexicon falls short, which leads to the feature mismatch problem. Many e-commerce sites now try to process reviews in their native languages. In this paper, we propose an enhanced dictionary in our native language (Tamil) that builds contextual relationships among the terms of multi-domain datasets in order to minimize the feature mismatch problem. The proposed dictionary employs both labeled and unlabeled data from the source domain and unlabeled data from the target domain. More precisely, the initial dictionary uses pointwise mutual information to calculate contextual weights, and the final dictionary then estimates a rank score based on the importance of terms across all the reviews. This work aims to classify reviews of multiple target domains in Tamil using a unified dictionary with a large vocabulary. This extensible dictionary significantly improves the accuracy of DA compared with the baseline methods and handles many words across multiple domains with ease.

Keywords Dictionary · Mutual information · Tamil language · Sentiment classification · Domain adaptation
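As a rough illustration of the pointwise mutual information (PMI) step mentioned in the abstract, the sketch below computes the standard review-level PMI, log2(p(a, b) / (p(a) p(b))), for every pair of terms that co-occur in at least one review. The paper's exact contextual-weight formula is not given in this section, so this is only the textbook definition; the function name `pmi_scores` and the toy English tokens are hypothetical stand-ins for tokenized Tamil reviews.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_scores(reviews):
    """Return {(term_a, term_b): PMI} for term pairs co-occurring in a review.

    Assumption: probabilities are estimated from review-level document
    frequencies; the paper may estimate them differently.
    """
    n = len(reviews)
    term_df = Counter()   # number of reviews containing each term
    pair_df = Counter()   # number of reviews containing both terms of a pair
    for tokens in reviews:
        vocab = sorted(set(tokens))            # count each term once per review
        term_df.update(vocab)
        pair_df.update(combinations(vocab, 2))  # pairs in alphabetical order

    scores = {}
    for (a, b), joint in pair_df.items():
        p_ab = joint / n
        p_a = term_df[a] / n
        p_b = term_df[b] / n
        scores[(a, b)] = math.log2(p_ab / (p_a * p_b))
    return scores


# Toy, hypothetical reviews (already tokenized).
toy_reviews = [
    ["good", "camera"],
    ["good", "battery"],
    ["bad", "battery"],
    ["good", "camera"],
]
toy_scores = pmi_scores(toy_reviews)
```

A positive score means two terms appear together more often than independence would predict, and a negative score means less often. In this toy data, "camera" and "good" receive a positive score while "battery" and "good" receive a negative one.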

Communicated by V. Loia.

K. Krishnakumari · E. Sivasankar · P. Balasubramanian

¹ Department of Computer Science and Engineering, National Institute of Technology, Tiruchirappalli, Tiruchirappalli 600 015, India

² Department of Computer Science and Engineering, A.V.C. College of Engineering, Mannampandal, Mayiladuthurai 609 305, India

1 Introduction

In the ever-growing field of information and communication technology (ICT), many online customers read online reviews before purchasing products. Proper analysis of customer reviews helps improve product quality. The dramatic increase in the number of products available has raised the rate of online purchases; Amazon, an electronic commerce company, for example, has launched more than three crore (30 million) products online. Most product-based companies receive feedback from customers and improve the performance of their products accordingly. It is now essential to collect feedback on every product purchased and to analyze it promptly to support quality branding. Reviews have a significant impact on product purchase decisions. Sentiment analysis (SA) is the process of analyzing words, sentences, or documents and categorizing them as positive, negative, or neutral, depending on the opinions or attitudes stated in the text (Liu 2012). Sentiment classification (SC) needs widespread knowledge of natural language processing