A text sentiment classification model using double word embedding methods


Mingqiang Zhou 1,2 & Dan Liu 1,2 & Yanhui Zheng 1,2 & Qingsheng Zhu 1,2 & Ping Guo 1,2

Received: 24 April 2020 / Revised: 10 July 2020 / Accepted: 9 September 2020
© Springer Science+Business Media, LLC, part of Springer Nature 2020

Abstract

Sentiment analysis is an important topic in natural language processing (NLP) and text classification. Existing lexicon-based sentiment classification algorithms can deal with small corpora or semantically simple texts. As text corpora have grown, word embedding methods have gained more attention. However, the single static word vector obtained by these methods cannot accurately express the semantic information of the text. To optimize the word vector, we propose a text sentiment classification model using double word embedding methods (DWE), which combines two models, GloVe and Word2vec, to represent the text as the combined dual-channel input of a convolutional neural network (CNN). Based on a word vector fine-tuning strategy, the initial word vectors are continuously learned and adjusted, yielding a CNN sentiment classification model whose combined input performs better than a single vector representation. Experimental results show that DWE can effectively improve the accuracy of sentiment classification, which reaches 94.8%.

Keywords Double Word Embedding · Convolutional Neural Network · Sentiment Classification · Natural Language Processing
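As a rough illustration of the dual-channel idea described in the abstract, the following minimal Python sketch stacks two embeddings of the same sentence as two input channels and applies a single convolution filter with ReLU and max-over-time pooling. It is a sketch under stated assumptions, not the model evaluated in this paper: the tiny `GLOVE` and `W2V` tables, the zero vector for unknown words, the all-ones filter, and the function names are all hypothetical, and the fine-tuning of the word vectors during training is omitted.

```python
# Toy stand-ins for pretrained vectors (hypothetical values); real GloVe and
# Word2vec tables would be loaded from pretrained files instead.
GLOVE = {"good": [0.9, 0.1], "bad": [-0.8, 0.2], "movie": [0.0, 0.5]}
W2V = {"good": [0.7, -0.1], "bad": [-0.6, 0.0], "movie": [0.1, 0.4]}
UNK = [0.0, 0.0]  # assumption: unknown words map to a zero vector


def embed(tokens, table):
    """Look up one vector per token, giving a (len(tokens) x dim) matrix."""
    return [table.get(t, UNK) for t in tokens]


def dual_channel(tokens):
    """Stack the two embeddings as channels: shape (2, len(tokens), dim)."""
    return [embed(tokens, GLOVE), embed(tokens, W2V)]


def conv_max_pool(channels, kernel, bias=0.0):
    """Slide one filter over word windows, apply ReLU, then max-over-time pooling.

    kernel has shape (2, h, dim): one (h x dim) slice per channel.
    Assumes the sentence has at least h tokens.
    """
    h = len(kernel[0])
    n = len(channels[0])
    features = []
    for start in range(n - h + 1):
        total = bias
        for c, channel in enumerate(channels):
            for i in range(h):
                for x, w in zip(channel[start + i], kernel[c][i]):
                    total += x * w
        features.append(max(0.0, total))  # ReLU
    return max(features)


tokens = ["good", "movie"]
x = dual_channel(tokens)
# Hypothetical filter of width h=2 whose weights are all 1.0.
kernel = [[[1.0, 1.0], [1.0, 1.0]] for _ in range(2)]
score = conv_max_pool(x, kernel)
```

In a trainable version, the two lookup tables would be initialized from the pretrained vectors and updated together with the filter weights, which is what the fine-tuning strategy refers to.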

* Mingqiang Zhou [email protected]

1 College of Computer Science, Chongqing University, Chongqing, China

2 Chongqing Key Laboratory of Software Theory and Technology, Chongqing, China

Multimedia Tools and Applications

1 Introduction

Social networking information touches all aspects of human life. Millions of people express their feelings and attitudes on social and e-commerce platforms, most often as text. For users, merchants, and researchers, it is very important to extract the information and implied sentiment in this network-generated text automatically and accurately [18]. For example, users can learn a product's praise rate from its evaluation information; merchants can form a corresponding market strategy by analyzing public opinion of the product [13]. Emotion classification is a method that automatically identifies these emotional tendencies, categorizing emotions as positive, neutral, or negative [21]. Existing research methods for sentiment classification include lexicon-based methods and machine learning methods, such as deep learning [7]. Lexicon-based sentiment classification methods rely mainly on the quality of the sentiment lexicon and on the manually constructed features that label it. Deep learning has made outstanding achievements in feature extraction, speech, and image processing, and the CNN in particular has also achieved some results in text sentiment classification [11]. The key work of sentiment classifi