A new ensemble feature selection approach based on genetic algorithm
METHODOLOGIES AND APPLICATION
Hongzhi Wang1 · Chengquan He1 · Zhuping Li1
© Springer-Verlag GmbH Germany, part of Springer Nature 2020
Abstract
In ensemble feature selection, adjusting the weight assigned to each feature subset can change the ensemble's performance significantly; finding the optimal weight vector is therefore a key and challenging problem. Aiming at this optimization problem, this paper proposes an ensemble feature selection approach based on a genetic algorithm (EFS-BGA). After each base feature selector generates a feature subset, EFS-BGA obtains an optimized weight for each feature subset through a genetic algorithm, unlike traditional genetic-algorithm approaches that operate directly on single features. We divide EFS-BGA into two variants: the first is a complete ensemble feature selection method; building on it, we further propose a selective EFS-BGA model. Through mathematical analysis, we then explain why weight adjustment is an optimization problem and how it can be solved. Finally, comparative experiments on multiple data sets demonstrate in practice the advantages of EFS-BGA over previous ensemble feature selection algorithms.

Keywords Ensemble feature selection · Optimization problem · Genetic algorithm
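As a concrete illustration of the weighted-ensemble idea summarized in the abstract, the sketch below aggregates the feature subsets produced by base selectors under a weight vector: each feature's score is the sum of the weights of the subsets that contain it. The subsets, weights, and scoring rule here are illustrative assumptions, not the paper's exact formulation.

```python
def weighted_feature_scores(subsets, weights):
    """Aggregate base feature subsets into per-feature scores.

    subsets : list of sets of feature indices, one per base selector
    weights : one weight per subset (the vector EFS-BGA optimizes)
    """
    scores = {}
    for subset, w in zip(subsets, weights):
        for f in subset:
            # A feature accumulates the weight of every subset it appears in.
            scores[f] = scores.get(f, 0.0) + w
    return scores


# Three hypothetical base selectors and an example weight vector.
subsets = [{0, 1, 3}, {1, 2}, {1, 3, 4}]
weights = [0.5, 0.2, 0.3]
scores = weighted_feature_scores(subsets, weights)
# Feature 1 appears in all three subsets, so it receives the highest score.
top = sorted(scores, key=scores.get, reverse=True)[:2]
```

Changing the weight vector reorders the final feature ranking, which is why the choice of weights is itself an optimization problem.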
1 Introduction

The ensemble feature selection method generates an optimized feature subset from a plurality of feature subsets that have already been obtained, using some integration strategy (Saeys et al. 2008). If multiple feature selection algorithms are applied to the same training set to obtain multiple feature subsets, the ensemble is heterogeneous; if the same feature selection algorithm is applied to different training sets, the ensemble is homogeneous. In our previous work, we analyzed the heterogeneous ensemble feature selection method. This paper mainly analyzes the homogeneous ensemble feature selection method: the original data set is sampled multiple times by the bootstrap method, and the resulting training subsets are used to train the same feature selection algorithm. Because the training data differ, the feature subsets generated by training differ as well (Mitchell et al. 2014). By adjusting the weights of the feature subsets, the contribution of each feature subset to the training performance can be changed. The theoretical analysis below shows that the generalization error of the weighted ensemble feature selection model is better than that of the unweighted model. For multiple feature subsets, obtaining the optimal weight vector can be considered an optimization problem. Aiming at this optimization problem, this paper proposes a new ensemble feature selection algorithm that uses a genetic algorithm.

Communicated by V. Loia.

Hongzhi Wang (corresponding author) [email protected]
1 Harbin Institute of Technology, Harbin, China
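The search for the weight vector described above can be sketched as a simple genetic algorithm over candidate weight vectors. This is a minimal sketch, not the paper's EFS-BGA procedure: the fitness function below is a synthetic stand-in (in EFS-BGA, fitness would come from evaluating the weighted ensemble on validation data), and all names, population sizes, and mutation rates are illustrative assumptions.

```python
import random

random.seed(0)

N_SUBSETS = 4

# Hypothetical "true" utility of each base feature subset; stands in for
# the validation performance the real fitness function would measure.
TRUE_UTILITY = [0.1, 0.7, 0.15, 0.05]


def fitness(weights):
    # Reward weight vectors whose normalized mass aligns with subset utility.
    total = sum(weights)
    return sum((w / total) * u for w, u in zip(weights, TRUE_UTILITY))


def crossover(a, b):
    # One-point crossover of two parent weight vectors.
    point = random.randrange(1, N_SUBSETS)
    return a[:point] + b[point:]


def mutate(w, rate=0.2):
    # Perturb each weight with small Gaussian noise; keep weights positive.
    return [max(1e-6, x + random.gauss(0, 0.1)) if random.random() < rate else x
            for x in w]


def run_ga(pop_size=20, generations=50):
    pop = [[random.random() for _ in range(N_SUBSETS)]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]          # elitist selection
        children = []
        while len(children) < pop_size - len(parents):
            a, b = random.sample(parents, 2)
            children.append(mutate(crossover(a, b)))
        pop = parents + children
    return max(pop, key=fitness)


best = run_ga()
```

Because selection is elitist, the best fitness never decreases across generations, and the returned weight vector concentrates mass on the subsets the fitness function rewards.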