An evolutionary computation-based approach for feature selection
ORIGINAL RESEARCH
Fateme Moslehi1 · Abdorrahman Haeri1
Received: 19 June 2019 / Accepted: 29 October 2019
© Springer-Verlag GmbH Germany, part of Springer Nature 2019
Abstract
Feature selection plays an important role in classification: by reducing the dimensionality of a dataset, it decreases computational time and can improve the accuracy and efficiency of a machine learning task. Feature selection is a process that selects a subset of features according to an optimization criterion. Traditional statistical methods have become ineffective for two reasons: the growth in the number of observations and the growth in the number of features associated with each observation. Feature selection methods reduce computational time, give a better understanding of the data, and improve the performance of machine learning and pattern recognition algorithms. The feature selection problem can be stated as finding a minimal subset of features that retains sufficient information for the problem at hand and increases the accuracy of the classification algorithm. Several techniques have been proposed to remove irrelevant and redundant features. In this paper, a novel feature selection algorithm is proposed that combines genetic algorithms (GA) and particle swarm optimization (PSO) for faster and better search capability. The hybrid algorithm makes use of the advantages of both PSO and GA. The gain ratio index is used to rank the features. To evaluate the approach, experiments were performed on seven real-world datasets, and the efficiency of the hybrid algorithm was compared with that of the basic algorithms. The results demonstrate that the presented approach can achieve higher classification accuracy than the other methods.
Keywords Feature selection · Evolutionary approach · Genetic algorithm · Particle swarm optimization (PSO) · Gain ratio index
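As an illustration of the gain ratio index mentioned in the abstract, the following minimal sketch ranks discrete features using the standard textbook definition: information gain divided by split information. It is not taken from the paper. The function names, the toy data, and the restriction to categorical features are assumptions made here for illustration, and the authors' exact procedure may differ.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(values)
    counts = Counter(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def gain_ratio(feature, labels):
    """Information gain of `feature` about `labels`, divided by split information."""
    total = len(labels)
    # Group the class labels by feature value.
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    # Expected label entropy after splitting on the feature.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    # Split information is the entropy of the feature's own value distribution.
    split_info = entropy(feature)
    # A constant feature has zero split information and carries no signal.
    return info_gain / split_info if split_info > 0 else 0.0


def rank_features(columns, labels):
    """Return feature names sorted by decreasing gain ratio."""
    scores = {name: gain_ratio(col, labels) for name, col in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical toy data: one feature that determines the label,
# one that is independent of it, and one that never varies.
toy_columns = {
    "informative": ["a", "a", "b", "b"],
    "noisy": ["x", "y", "x", "y"],
    "constant": ["k", "k", "k", "k"],
}
toy_labels = [1, 1, 0, 0]
ranking = rank_features(toy_columns, toy_labels)
```

In a pipeline of this kind, such a ranking would typically be used to order or pre-filter features before a search procedure (here, the GA and PSO hybrid) explores candidate subsets. Continuous features would need to be discretized first under this definition.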
* Abdorrahman Haeri
[email protected]
Fateme Moslehi
[email protected]
1 School of Industrial Engineering, Iran University of Science and Technology, Tehran, Iran

1 Introduction
In recent years, progress in science and technology has produced datasets with high dimensionality and relatively few patterns. The goal of data mining and knowledge discovery is to reach intelligent decisions from a massive amount of data (Chen et al. 2011; Anagaw and Chang 2018). A large number of unrelated and inappropriate features slows learning and degrades learning models. This growth in dimensionality increases the complexity of building models, a problem known in data mining as the “curse of dimensionality”. This means irrelevant and redundant features have a negative effect and decrease