Training error and sensitivity-based ensemble feature selection


ORIGINAL ARTICLE

Wing W. Y. Ng¹ · Yuxi Tuo¹ · Jianjun Zhang¹ · Sam Kwong²

Received: 22 October 2019 / Accepted: 22 March 2020
© Springer-Verlag GmbH Germany, part of Springer Nature 2020

Abstract

Ensemble feature selection combines feature selection and ensemble learning to improve the generalization capability of ensemble systems. However, current methods that minimize only the training error may not generalize well to future unseen samples. In this paper, we propose a training error and sensitivity-based ensemble feature selection method. NSGA-III is applied to find optimal feature subsets by simultaneously minimizing two objective functions of the whole ensemble system: the training error and the sensitivity of the ensemble. With this scheme, the ensemble system maintains both high accuracy and high stability, which is expected to yield a high generalization capability. Experimental results on 18 datasets show that the proposed method significantly outperforms state-of-the-art methods.

Keywords  Ensemble · Feature selection · Sensitivity · NSGA-III

1 Introduction

In recent years, both the number of samples and the dimensionality of datasets have increased greatly [1]. Certain features in a dataset may be redundant and/or noisy, which may cause confusion during learning [2] and increase both the complexity and the difficulty of classification. Feature selection has therefore been proposed to eliminate irrelevant and redundant features, improving classification performance and computational efficiency simultaneously. Ensemble learning methods, which combine the outputs of multiple models instead of learning a single classification model, usually outperform single models [3]. Ensemble learning has proven effective and can also be used to improve other machine learning disciplines such as feature selection [4]. Ensemble feature selection uses multiple feature selectors to find feature subsets for constructing an ensemble of accurate and diversified base models, yielding a more robust classification model [5].
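To make the idea concrete, the following is a minimal sketch of an ensemble built from several feature subsets, together with the two quantities named in the abstract (training error and sensitivity) evaluated for the ensemble as a whole. Everything in it is an illustrative assumption rather than the method of this paper: the nearest-centroid base model, the majority vote, the function names, and in particular the sensitivity measure, which is approximated here as the fraction of ensemble predictions that change under small uniform input perturbations. The parameters `radius` and `n_perturb` are hypothetical.

```python
import numpy as np

def fit_centroids(X, y):
    """Nearest-centroid base model (an assumed stand-in): one mean per class."""
    classes = np.unique(y)
    centroids = np.stack([X[y == c].mean(axis=0) for c in classes])
    return classes, centroids

def predict_centroids(model, X):
    """Assign each sample to the class with the closest centroid."""
    classes, centroids = model
    # Squared distances, shape (n_samples, n_classes)
    d = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return classes[d.argmin(axis=1)]

def ensemble_predict(models, masks, X):
    """Majority vote; each base model sees only its own feature subset.
    Labels are assumed to be non-negative integers."""
    votes = np.stack([predict_centroids(m, X[:, mask])
                      for m, mask in zip(models, masks)])
    return np.array([np.bincount(col).argmax() for col in votes.T])

def objectives(masks, X, y, radius=0.1, n_perturb=20, seed=0):
    """Return (training error, sensitivity) of the whole ensemble.
    Sensitivity here is an assumed proxy: the mean fraction of ensemble
    predictions that flip when inputs are perturbed within +/- radius."""
    rng = np.random.default_rng(seed)
    models = [fit_centroids(X[:, m], y) for m in masks]
    base = ensemble_predict(models, masks, X)
    train_error = float(np.mean(base != y))
    flipped = 0.0
    for _ in range(n_perturb):
        noise = rng.uniform(-radius, radius, size=X.shape)
        flipped += np.mean(ensemble_predict(models, masks, X + noise) != base)
    return train_error, float(flipped / n_perturb)
```

In a multi-objective search such as NSGA-III, each candidate solution would encode one set of boolean feature masks, and the pair returned by `objectives` would serve as its fitness, with both values minimized. The search itself is not sketched here.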

* Jianjun Zhang [email protected]

1 Guangdong Provincial Key Lab of Computational Intelligence and Cyberspace Information, School of Computer Science and Engineering, South China University of Technology, Guangzhou, China

2 Department of Computer Science, Hong Kong City University, Hong Kong, China

In many existing ensemble feature selection methods, each base feature selector selects its feature subset independently. However, the final classification model is a combination of an ensemble of classifiers trained on these feature subsets. Interaction among these individual models may be irrelevant, redundant, or even noisy [6], which may restrict further improvement of the classification performance. This implies that combining the best individuals does not necessarily guarantee that the final classifier ensemble yields the best generalization capability. In addition, classifiers learnt from the selected feature sub