Feature Selection Method Based on Differential Correlation Information Entropy



Xiujuan Wang¹ · Yixuan Yan¹ · Xiaoyue Ma¹

¹ Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China

Correspondence: Yixuan Yan (corresponding author) [email protected] · Xiujuan Wang [email protected] · Xiaoyue Ma [email protected]

© Springer Science+Business Media, LLC, part of Springer Nature 2020

Abstract

Feature selection is one of the major aspects of pattern classification systems. In previous studies, Ding and Peng recognized the importance of feature selection and proposed a minimum-redundancy feature selection method to minimize redundant features during sequential selection in microarray gene expression data. However, because the minimum-redundancy method relies mainly on mutual information, which measures the dependency only between pairs of random variables, its results cannot be optimal without evaluating the feature subset globally. Therefore, within the minimum redundancy-maximum relevance framework, this paper introduces entropy to evaluate feature subsets globally and proposes a new subset evaluation criterion, differential correlation information entropy. In our criterion, different bivariate correlation metrics can be substituted, and feature selection is completed through sequential forward search. Two different classification models are applied to eleven standard data sets from the UCI machine learning repository to compare our method with algorithms such as mRMR, ReliefF, and the feature selection method with joint maximal information entropy. The experimental results show that feature selection based on our proposed method is clearly superior to that of the other models.

Keywords Differential correlation information entropy · mRMR · Classification · Feature selection
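To make the pipeline described in the abstract concrete, the following minimal sketch shows a sequential forward search driven by a pluggable subset criterion. The criterion used here (mean absolute Pearson correlation with the label minus mean pairwise feature correlation) is only a stand-in for illustration; it is not the differential correlation information entropy defined later in the paper, and all names in the sketch are hypothetical.

```python
# Minimal sketch of sequential forward search (SFS) with a pluggable
# subset-evaluation criterion. subset_score below is a placeholder,
# NOT the paper's differential correlation information entropy.
import numpy as np

def pearson(a, b):
    """Absolute Pearson correlation between two 1-D arrays."""
    return abs(np.corrcoef(a, b)[0, 1])

def subset_score(X, y, subset):
    """Placeholder criterion: relevance to y minus redundancy within the subset."""
    relevance = np.mean([pearson(X[:, j], y) for j in subset])
    if len(subset) < 2:
        return relevance
    pairs = [(i, j) for i in subset for j in subset if i < j]
    redundancy = np.mean([pearson(X[:, i], X[:, j]) for i, j in pairs])
    return relevance - redundancy

def sequential_forward_search(X, y, k):
    """Greedily grow a feature subset of size k, one best feature at a time."""
    selected, remaining = [], list(range(X.shape[1]))
    while len(selected) < k:
        best = max(remaining, key=lambda j: subset_score(X, y, selected + [j]))
        selected.append(best)
        remaining.remove(best)
    return selected

# Toy usage: 100 samples, 6 features, label driven by features 0 and 2.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 6))
y = X[:, 0] + X[:, 2] + 0.1 * rng.normal(size=100)
print(sequential_forward_search(X, y, 3))
```

Substituting the paper's entropy-based criterion for subset_score would yield the proposed method's search procedure; the greedy loop itself is unchanged.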

1 Introduction

In recent years, the exponential growth of data volume in various industries has brought new challenges to machine learning research. Irrelevant and redundant data features increase the computational complexity of machine learning models, which greatly affects the accuracy and efficiency of model learning, a phenomenon also called "the curse of dimensionality". Feature selection can eliminate irrelevant and redundant features to reduce spatial complexity and improve the accuracy and efficiency of machine learning models. The advantages of feature selection can be summarized as follows: (1) dimension reduction decreases the computational complexity of the learning models; (2) noise reduction improves classification accuracy; and (3) more interpretable features contribute to identifying and monitoring target diseases or function types [1]. The purpose of feature selection is to reduce the dimensionality of the data by removing features that are irrelevant or redundant [2]. For microarray gene expression data, Ding and Peng proposed a filter-based method, minimum redundancy maximum relevancy (mRMR), to find the optimum subset of genes [1,3]; a greedy form of its selection rule is sketched below.
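As a concrete illustration of the greedy filter idea, the sketch below implements the common "difference" form of the mRMR rule: at each step, add the unselected feature f that maximizes I(f; y) minus its mean mutual information with the already selected features. This is a minimal sketch, not Ding and Peng's reference implementation; the use of scikit-learn's nearest-neighbor mutual information estimators (mutual_info_classif, mutual_info_regression) and the toy data are assumptions for demonstration.

```python
# Compact sketch of the mRMR "difference" rule:
# pick argmax_f [ I(f; y) - mean_{s in S} I(f; s) ] at each greedy step.
# Illustrative only; estimator choices are assumptions, not the original code.
import numpy as np
from sklearn.feature_selection import mutual_info_classif, mutual_info_regression

def mrmr(X, y, k):
    n_features = X.shape[1]
    relevance = mutual_info_classif(X, y, random_state=0)  # I(f; y) per feature
    selected = [int(np.argmax(relevance))]                 # seed with most relevant
    while len(selected) < k:
        best_j, best_score = None, -np.inf
        for j in range(n_features):
            if j in selected:
                continue
            # Redundancy: mean mutual information with already-selected features.
            redundancy = np.mean([
                mutual_info_regression(X[:, [j]], X[:, s], random_state=0)[0]
                for s in selected
            ])
            score = relevance[j] - redundancy
            if score > best_score:
                best_j, best_score = j, score
        selected.append(best_j)
    return selected

# Toy usage: the label depends on features 0 and 1; feature 2 nearly copies 0.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
X[:, 2] = X[:, 0] + 0.05 * rng.normal(size=200)            # redundant feature
y = (X[:, 0] + X[:, 1] > 0).astype(int)
print(mrmr(X, y, 2))
```

On this toy data, the near-copy (feature 2) is heavily penalized by the redundancy term once feature 0 is selected, so the second pick is typically the genuinely complementary feature 1.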