Feature selection based on maximal neighborhood discernibility


ORIGINAL ARTICLE

Changzhong Wang¹ · Qiang He² · Mingwen Shao³ · Qinghua Hu⁴

Received: 16 December 2016 / Accepted: 7 August 2017
© Springer-Verlag GmbH Germany 2017

* Qiang He [email protected]

¹ Department of Mathematics, Bohai University, Jinzhou 121000, People’s Republic of China
² College of Science, Beijing University of Civil Engineering and Architecture, Beijing 100044, People’s Republic of China
³ College of Computer and Communication Engineering, Chinese University of Petroleum, Qingdao, Shandong 266580, People’s Republic of China
⁴ School of Computer Science and Technology, Tianjin University, Tianjin 300072, People’s Republic of China

Abstract The neighborhood rough set model has been proven to be an effective tool for feature selection. In this model, the positive region of the decision is used to evaluate the classification ability of a subset of candidate features. It is computed by considering only consistent samples. However, classification ability is related not only to consistent samples, but also to the ability to discriminate between samples with different decisions. Hence, the dependency function, which is constructed from the positive region, cannot reflect the actual classification ability of a feature subset. In this paper, we propose a new feature evaluation function for feature selection that uses the discernibility matrix. We first introduce the concept of a neighborhood discernibility matrix to characterize the classification ability of a feature subset. We then present the relationship between the distance matrix and the discernibility matrix, and construct a feature evaluation function based on the discernibility matrix. This function is used to measure the significance of a candidate feature. The proposed model not only maintains the maximal dependency function, but also selects features with the greatest discernibility ability. The experimental results show that the proposed method can be used to deal with heterogeneous data sets and that it is able to find effective feature subsets in comparison with some existing algorithms.

Keywords Feature selection · Neighborhood · Rough sets · Discernibility matrix
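The contrast drawn in the abstract, between a dependency computed only from consistent samples and a measure of how well samples with different decisions are told apart, can be illustrated with a minimal sketch. It is not the authors' evaluation function: the Chebyshev distance, the radius `delta`, the function names and the toy data are all assumptions made for illustration. `neighborhood_dependency` follows the usual neighborhood rough set definition (the fraction of samples whose neighborhood contains a single decision), while `discerned_pairs` is a simplified stand-in for a discernibility-style count.

```python
import numpy as np

def pairwise_distance(X, features):
    """Chebyshev distance between all sample pairs, restricted to `features` (assumed metric)."""
    Xb = X[:, features]
    return np.abs(Xb[:, None, :] - Xb[None, :, :]).max(axis=2)

def neighborhood_dependency(X, y, features, delta):
    """Fraction of samples whose delta-neighborhood contains only their own decision."""
    neighbors = pairwise_distance(X, features) <= delta
    same_decision = y[:, None] == y[None, :]
    # A sample is consistent when every neighbor shares its decision.
    consistent = np.all(~neighbors | same_decision, axis=1)
    return float(consistent.mean())

def discerned_pairs(X, y, features, delta):
    """Number of unordered pairs with different decisions lying farther apart than delta.

    Hypothetical helper: a simplified illustration, not the paper's definition.
    """
    far_apart = pairwise_distance(X, features) > delta
    different = y[:, None] != y[None, :]
    return int(np.triu(far_apart & different, k=1).sum())

# Toy data (made up): feature 0 separates the two decisions, feature 1 does not.
X = np.array([[0.10, 0.90],
              [0.20, 0.10],
              [0.80, 0.85],
              [0.90, 0.20]])
y = np.array([0, 0, 1, 1])

for f in (0, 1):
    print(f, neighborhood_dependency(X, y, [f], 0.3), discerned_pairs(X, y, [f], 0.3))
```

On this toy data, feature 0 makes every sample consistent and separates all four cross-decision pairs, whereas feature 1 leaves every sample inconsistent and separates only two pairs.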

1 Introduction

With the development of information technology, more and more features are acquired and stored in databases. Some of these features may not be closely related to a given classification task. Irrelevant or redundant features can increase the risk that a classifier over-fits the training data, which easily leads to poor generalization ability. Feature selection, or attribute reduction, is an important technique for reducing redundant features and has attracted much attention in machine learning and pattern recognition. Feature evaluation is a key issue in feature selection. It has a great impact on optimal feature selection. In general, different feature evaluation functions may lead to different optimal feature subsets. A good evaluation function is always related to high classification performance. Until now, a great number of evaluation functions have be
