Novel approach with nature-inspired and ensemble techniques for optimal text classification
Anshu Khurana 1 & Om Prakash Verma 2
Received: 31 May 2019 / Revised: 1 March 2020 / Accepted: 1 May 2020
© Springer Science+Business Media, LLC, part of Springer Nature 2020
Abstract
Text classification reduces time and space complexity by dividing the complete task into different classes. The main problem with text classification is the vast number of features extracted from the textual data. Pre-processed datasets have many features, some of which are undesirable and act only as noise. In this paper, a novel approach for optimal text classification based on a nature-inspired algorithm and an ensemble classifier is proposed. In the proposed model, feature selection is performed with the Biogeography Based Optimization (BBO) algorithm, and classification is performed with an ensemble classifier (Bagging). Using an ensemble classifier delivers better performance for optimal text classification than an individual classifier and hence improves accuracy. An ensemble combines individual classifiers so that the weaknesses of each are compensated by the others, whereas an individual classifier alone does not improve on the results of the ensemble. The features selected by the BBO algorithm are classified into various classes using six machine learning classifiers. The experimental results are computed on ten text classification datasets taken from the UCI repository and one real-time dataset of an airline. Four measures, namely Accuracy, Precision, Recall and F-measure, are used to validate the performance of our model with ten-fold cross-validation. For the feature selection process, a comparison is performed among state-of-the-art algorithms available in the literature. Results show that BBO for feature selection outperforms the other similar nature-based optimization techniques. Our proposed approach of BBO with an ensemble classifier is also compared with techniques proposed by other researchers, and the results are analyzed quantitatively and qualitatively.
Keywords Text classification · Feature selection · Nature-based optimization · Machine learning classifier · Ensemble classifier · Biogeography based optimization
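The feature selection step summarized in the abstract can be illustrated with a minimal sketch of a binary BBO loop. This is not the authors' implementation: the linear migration model, the elitism and mutation settings, and every name used here (`bbo_feature_selection`, `toy_fitness`, `INFORMATIVE`) are assumptions made for illustration. In the setting the abstract describes, the fitness of a feature mask would be the cross-validated accuracy of the bagging classifier on the selected features; a toy fitness stands in for it here so that the example is self-contained.

```python
import random

def bbo_feature_selection(fitness, n_features, pop_size=20, generations=50,
                          mutation_rate=0.02, n_elite=2, seed=0):
    """Simplified binary BBO. Returns (best_mask, best_score);
    best_mask[j] == 1 means feature j is kept."""
    rng = random.Random(seed)
    # Each habitat is a binary mask over the features.
    pop = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(pop_size)]
    for _ in range(generations):
        # Rank habitats from most to least suitable (highest fitness first).
        pop.sort(key=fitness, reverse=True)
        # Assumed linear migration model: better habitats emigrate more
        # (high mu) and accept fewer immigrating features (low lam).
        mu = [(pop_size - k) / pop_size for k in range(pop_size)]
        lam = [1.0 - m for m in mu]
        # Elitism: carry the best habitats over unchanged.
        new_pop = [h[:] for h in pop[:n_elite]]
        for i in range(n_elite, pop_size):
            habitat = pop[i][:]
            for j in range(n_features):
                if rng.random() < lam[i]:
                    # Migration: copy feature j from a source habitat chosen
                    # by roulette wheel in proportion to emigration rate.
                    src = rng.choices(range(pop_size), weights=mu)[0]
                    habitat[j] = pop[src][j]
                if rng.random() < mutation_rate:
                    # Mutation: flip the feature bit.
                    habitat[j] = 1 - habitat[j]
            new_pop.append(habitat)
        pop = new_pop
    best = max(pop, key=fitness)
    return best, fitness(best)

# Hypothetical toy fitness: features 0, 3 and 5 are useful, the rest are noise.
INFORMATIVE = {0, 3, 5}

def toy_fitness(mask):
    kept = sum(mask[j] for j in INFORMATIVE)
    noise = sum(mask) - kept
    return kept - 0.1 * noise  # reward useful features, penalize noisy ones

mask, score = bbo_feature_selection(toy_fitness, n_features=10, seed=1)
print(mask, score)
```

Because the best habitats are copied unchanged into each new generation, the best score in this sketch never decreases. Replacing `toy_fitness` with a function that trains and evaluates a classifier on the masked features would bring it closer to the wrapper-style selection the abstract outlines.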
* Anshu Khurana [email protected]
1 Department of Computer Science and Engineering, Delhi Technological University, New Delhi, India
2 Department of Electronics and Communication, Delhi Technological University, New Delhi, India
Multimedia Tools and Applications
1 Introduction
With the rise of research areas related to data mining, advances in Information and Communication Technology (ICT) give all users the opportunity to access information quickly. As demand for information increases, the number of text documents increases with it, because available and accessible digital information is saved and organized in the form of text [26]. The application area of text classification is widely