ORIGINAL ARTICLE
Training error and sensitivity-based ensemble feature selection

Wing W. Y. Ng1 · Yuxi Tuo1 · Jianjun Zhang1 · Sam Kwong2

Received: 22 October 2019 / Accepted: 22 March 2020
© Springer-Verlag GmbH Germany, part of Springer Nature 2020
Abstract
Ensemble feature selection combines feature selection and ensemble learning to improve the generalization capability of ensemble systems. However, current methods that minimize only the training error may not generalize well to future unseen samples. In this paper, we propose a training error and sensitivity-based ensemble feature selection method. NSGA-III is applied to find optimal feature subsets by simultaneously minimizing two objective functions of the whole ensemble system: the training error and the sensitivity of the ensemble. Under this scheme, the ensemble system maintains both high accuracy and high stability, which is expected to yield high generalization capability. Experimental results on 18 datasets show that the proposed method significantly outperforms state-of-the-art methods.

Keywords Ensemble · Feature selection · Sensitivity · NSGA-III
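To make the two objectives concrete, the following is a minimal sketch, not the authors' code: training error is read as the misclassification rate on the training set, and sensitivity as a Monte Carlo estimate of how much the ensemble's output changes under small random input perturbations. The perturbation width q and the sample count n_perturb are illustrative assumptions, as is the generic ensemble_predict callable.

```python
import numpy as np

def training_error(ensemble_predict, X, y):
    """Objective 1: fraction of misclassified training samples."""
    return float(np.mean(ensemble_predict(X) != y))

def stochastic_sensitivity(ensemble_predict, X, q=0.01, n_perturb=20, rng=None):
    """Objective 2 (sketch): mean squared change of the ensemble output
    when each input is perturbed uniformly in [-q, q] per dimension.
    A lower value indicates a more stable (less sensitive) ensemble."""
    rng = np.random.default_rng(rng)
    base = ensemble_predict(X).astype(float)
    diffs = []
    for _ in range(n_perturb):
        delta = rng.uniform(-q, q, size=X.shape)
        diffs.append((ensemble_predict(X + delta).astype(float) - base) ** 2)
    return float(np.mean(diffs))
```

A multi-objective optimizer such as NSGA-III would then search over binary feature masks, evaluating each candidate ensemble on both objectives at once and keeping the nondominated trade-offs.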
1 Introduction

In recent years, in addition to the growth in the number of samples, the dimensionality of datasets has also increased greatly [1]. Certain features in these datasets may be redundant and/or noisy, which may cause confusion during learning [2] and increase both the complexity and the difficulty of classification. Feature selection is therefore used to eliminate irrelevant and redundant features, improving classification performance and computational efficiency simultaneously. Instead of learning a single classification model, ensemble learning methods combine the outputs of multiple models and usually outperform single models [3]. Ensemble learning has proven its effectiveness and can also be used to improve other machine learning tasks such as feature selection [4]. Ensemble feature selection uses multiple feature selectors to find feature subsets for constructing an ensemble of accurate and diversified base models, yielding a more robust classification model [5].
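As a toy illustration of this pipeline (not the paper's method), the sketch below builds an ensemble in which each base classifier sees its own feature subset and the final prediction is a majority vote. Random binary masks stand in for the learned feature selectors; the dataset, the number of base learners, and the keep probability are all illustrative assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=300, n_features=40, random_state=0)

masks, models = [], []
for _ in range(7):                          # 7 base learners (arbitrary choice)
    mask = rng.random(X.shape[1]) < 0.5     # each feature kept with prob. 0.5
    mask[rng.integers(X.shape[1])] = True   # guarantee a non-empty subset
    clf = DecisionTreeClassifier(random_state=0).fit(X[:, mask], y)
    masks.append(mask)
    models.append(clf)

def ensemble_predict(X_new):
    """Majority vote over base classifiers, each using its own feature subset."""
    votes = np.stack([m.predict(X_new[:, k]) for m, k in zip(models, masks)])
    return (votes.mean(axis=0) > 0.5).astype(int)

print("training accuracy:", (ensemble_predict(X) == y).mean())
```

Diversity here comes from the differing feature subsets; an ensemble feature selection method replaces the random masks with subsets chosen to make the base models both accurate and complementary.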
* Jianjun Zhang
[email protected]

1 Guangdong Provincial Key Lab of Computational Intelligence and Cyberspace Information, School of Computer Science and Engineering, South China University of Technology, Guangzhou, China

2 Department of Computer Science, City University of Hong Kong, Hong Kong, China
For many existing ensemble feature selection methods, feature subsets are selected independently by each base feature selector. However, the final classification model is a combination of an ensemble of classifiers trained on these feature subsets. Among these individually built models, some may be irrelevant, redundant, or even noisy [6], which may restrict further improvement of the classification performance. This implies that combining the best individual models does not necessarily guarantee that the final classifier ensemble yields the best generalization capability. In addition, classifiers learnt from the selected feature sub…