Feature selection by using chaotic cuckoo optimization algorithm with levy flight, opposition-based learning and disrupt


METHODOLOGIES AND APPLICATION

Feature selection by using chaotic cuckoo optimization algorithm with levy flight, opposition-based learning and disruption operator

Mahsa kelidari¹ • Javad Hamidzadeh¹

© Springer-Verlag GmbH Germany, part of Springer Nature 2020

Abstract Feature selection, which plays an important role in high-dimensional data analysis, has drawn increasing attention in recent years. Finding the most relevant features for classification is one of the most important tasks in data mining and machine learning, since datasets commonly contain irrelevant features that reduce the accuracy rate and slow down the classifier. Feature selection is an optimization process that aims to improve the accuracy rate of data classification while reducing the number of selected features. Using too many features both requires a large memory capacity and slows execution. Feature selection algorithms decide which features should be used by a classification algorithm. Traditional algorithms proved inefficient because of the high dimensionality of the problem, so evolutionary algorithms have been used to improve the search. The algorithm proposed in this paper, the chaotic cuckoo optimization algorithm with levy flight, disruption operator and opposition-based learning (CCOALFDO), is applied to select the optimal feature subspace for classification. It reduces the randomness in selecting features and avoids getting stuck in local optima, which leads to a more interesting feature subset. Extensive experiments are conducted on 20 high-dimensional datasets to demonstrate the effectiveness and efficiency of the proposed method. The results show that the proposed method outperforms state-of-the-art methods in terms of classification accuracy rate. In addition, they demonstrate the ability of CCOALFDO to select the most relevant features for classification tasks. Thus, it is a reasonable solution for handling noise and avoiding serious negative impacts on the classification accuracy rate in real-world datasets.
Keywords Feature selection · High-dimensional data · Cuckoo optimization algorithm · Chaotic theory · Levy flight · Disruption operator · Opposition-based learning

Communicated by V. Loia.

Javad Hamidzadeh [email protected]
Mahsa kelidari [email protected]

¹ Faculty of Computer Engineering and Information Technology, Sadjad University of Technology, Mashhad, Iran

1 Introduction Feature selection in high-dimensional data is a preprocessing step that has been widely used to improve the performance of learning algorithms in many fields (Mafarja et al. 2019). This preprocessing technique selects relevant features, removes the others, decreases the time required to learn the predictors and finally leads to an easier interpretation by simplifying the models (Thaher et al. 2020; Yan et al. 2019). Nowadays, feature selection is an important task in machine learnin