An application of MOGW optimization for feature selection in text classification


Razieh Asgarnezhad1 · S. Amirhassan Monadjemi2 · Mohammadreza Soltanaghaei1
Accepted: 23 October 2020
© Springer Science+Business Media, LLC, part of Springer Nature 2020

Abstract
With the spread of web applications, sentiment classification (SC) has become a topic of interest among text mining experts. The sheer volume of online reviews makes it hard for companies and individuals to apply effective models in their decision making. Pre-processing contributes greatly to sentiment classification. Traditional bag-of-words approaches do not capture multiple relationships among words. This study emphasizes the pre-processing stage and data reduction techniques, which can make a big difference in sentiment classification efficiency. To classify opinions, a multi-objective grey wolf optimization algorithm is proposed in which the two objectives are to decrease the error of the Naïve Bayes and K-nearest neighbour classifiers, with a neural network as the final classifier. The proposed framework is evaluated on three datasets. With 95.76% precision, 95.75% accuracy, 95.99% recall, and 95.82% f-measure, the framework outperforms its counterparts.

Keywords Sentiment classification · Feature selection · Multi-objective grey wolf optimization · Naïve Bayes · K-nearest neighbour · Multi-layer neural network
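The abstract names two error objectives but does not say how candidate feature subsets are compared. One common way to compare solutions in multi-objective optimization is Pareto dominance, and the following is only a minimal sketch of that idea, not the authors' procedure. The names `dominates` and `pareto_front` are hypothetical, and the error pairs are made-up placeholders rather than results from the paper.

```python
from typing import List, Tuple

# (Naive Bayes error, KNN error) for one candidate feature subset.
# Both values are to be minimized.
Objectives = Tuple[float, float]


def dominates(a: Objectives, b: Objectives) -> bool:
    """True if a is no worse than b on every objective and strictly better on one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(candidates: List[Objectives]) -> List[Objectives]:
    """Keep only the candidates that no other candidate dominates."""
    return [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates)
    ]


# Made-up error pairs for four hypothetical feature subsets (not from the paper).
candidates = [(0.10, 0.20), (0.15, 0.12), (0.12, 0.25), (0.20, 0.30)]
front = pareto_front(candidates)
print(front)
```

In this toy input, the first two subsets trade off one classifier's error against the other, so neither dominates the other and both stay on the front; the last two are worse on both objectives than the first and are dropped.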

* S. Amirhassan Monadjemi [email protected]
Razieh Asgarnezhad [email protected]
Mohammadreza Soltanaghaei [email protected]

1 Department of Computer Engineering, Isfahan (Khorasgan) Branch, Islamic Azad University, Isfahan, Iran
2 Faculty of Computer Engineering, University of Isfahan, Isfahan, Iran


1 Introduction
With the explosion of information on the Internet, it is hard to make decisions based on reviews, tweets, etc. People purchase products on the Internet and immediately express their opinions. These opinions have a significant effect on the financial statements of the companies involved. The main problem in this process is that the opinions are expressed in natural language. There is a big gap between opinions in natural language (i.e., unstructured data) and the applications that work on structured data [1]. More than 80% of all knowledge, by volume, is stored as text, documents, video, and voice media. In computer science, these documents are considered unstructured. In knowledge extraction, realization is a must before searching for implicit meanings and concepts. Idea mining in any text is attributed to the technical phase of what humans can search for. Search engines look for keywords when finding text data, and keywords reflect the probable presented facts, not ideas. Expressing ideas through keywords is impossible [2]. Sentiment classification (SC) is an appealing field in text mining. The extracted opinions from the unstructured data on the Internet become class
