Sentiment Mining Using SVM-Based Hybrid Classification Model

Abstract

With the rapid growth of social networks, the opinions expressed on them play an influential role in day-to-day life. This creates a need for a sentiment mining model that retrieves opinions for decision making. Although the support vector machine (SVM) has been shown to give good classification results in sentiment mining, practical SVM implementations often fall short of the theoretically expected performance, because they rely on approximate algorithms to cope with the high time and space complexity. To improve the limited classification performance of practical SVMs, we propose a hybrid model of SVM and principal component analysis (PCA). In this paper, we use PCA to reduce the data dimensionality and thereby decrease the complexity of an SVM-based sentiment classification task. Experimental results on product reviews show that the proposed hybrid model of SVM with PCA outperforms a single SVM in terms of classification accuracy and the receiver operating characteristic (ROC) curve.

Keywords Sentiment · Opinion · Mining · Hybrid model · PCA

G. Vinodhini (✉) · R. M. Chandrasekaran
Department of Computer Science and Engineering, Annamalai University, Annamalai Nagar, Chidambaram 608002, India
e-mail: [email protected]

G. S. S. Krishnan et al. (eds.), Computational Intelligence, Cyber Security and Computational Models, Advances in Intelligent Systems and Computing 246, DOI: 10.1007/978-81-322-1680-3_18, © Springer India 2014

1 Introduction

With the rapid growth of e-commerce and the large number of online reviews available in digital form, the need to organize them arises. Various machine learning classifiers have been used in sentiment classification [8]. Many studies in the machine learning community have shown that combining individual classifiers is an effective technique for improving classification accuracy, and there are different ways in which classifiers can be combined to classify new instances. Dimension reduction plays an important part in optimizing the performance of a classifier by reducing the feature vector size. Principal component analysis (PCA) can transform the original dataset of correlated variables into a smaller dataset of uncorrelated variables that are linear combinations of the original ones. Support vector machines (SVMs) have been recognized as one of the most successful classification methods for many applications, including sentiment classification. Even though the learning ability and the computational complexity of training an SVM may be independent of the dimension of the feature space, reducing computational complexity is essential for efficiently handling the large number of terms that arise in practical text sentiment classification. In this study, we introduce an SVM-based hybrid sentiment classification model, with PCA as the dimension reduction step, for online product reviews using the product attributes as features. The results are compared with an individual