A Novel Feature Selection Method for Classification Using a Fuzzy Criterion
1 High Performance Computing and Networking Institute, National Research Council, Naples, Italy
2 Department of European and Mediterranean Studies, Second University of Naples, Caserta, Italy
3 Department of Informatics, Kaunas University of Technology, Kaunas, Lithuania
[email protected]
Abstract. Although many classification methods take advantage of fuzzy set theory, the same cannot be said for feature reduction methods. In this paper we explore ideas related to the use of fuzzy sets and we propose a novel fuzzy feature selection method tailored for the Regularized Generalized Eigenvalue Classifier (ReGEC). The method provides small and robust subsets of features that can be used for supervised classification. We show, using real-world datasets, that the performance of the ReGEC classifier on the selected features compares well with that obtained using all of them.
1 Introduction
In many practical situations, datasets are large in both size and dimensionality and include many irrelevant and redundant features. In a classification context, learning from huge datasets may not work well, even though in theory more features should provide more discriminant power. To address this problem, two kinds of algorithms can be used: feature transformation (or extraction) and feature selection. Feature transformation consists in constructing new features (in a lower-dimensional space) from the original ones. Such methods include clustering, basic linear transforms of the input variables (Principal Component Analysis/Singular Value Decomposition, Linear Discriminant Analysis), spectral transforms, wavelet transforms and convolutions of kernels. The basic idea of a feature transformation is to project a high-dimensional feature vector onto a low-dimensional space. Unfortunately, the projection leads to a loss of the measurement units of the features, and the resulting features are not easy to interpret. Feature selection (FS) may overcome these disadvantages. FS aims at selecting a subset of features that is relevant in terms of discrimination capability. It avoids the drawback of poor output interpretability, because the selected features are a subset of the given ones.
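As a minimal illustration of the difference between the two approaches (not part of the original paper), the following Python sketch uses scikit-learn to project a dataset onto a lower-dimensional space with PCA and, alternatively, to keep a subset of the original columns with a simple univariate filter. The dataset, the number of retained dimensions, and the filter are arbitrary choices for the example; in particular, the filter is not the fuzzy criterion proposed in this paper.

```python
# Hedged sketch: contrasts feature transformation (PCA) with feature selection.
# Dataset and parameter choices are illustrative only.
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA
from sklearn.feature_selection import SelectKBest, f_classif

X, y = load_breast_cancer(return_X_y=True)            # 569 samples, 30 features

# Feature transformation: new features are linear combinations of all inputs,
# so the original measurement units (and interpretability) are lost.
pca = PCA(n_components=5)
X_transformed = pca.fit_transform(X)
print("PCA-transformed shape:", X_transformed.shape)  # (569, 5)

# Feature selection: keep a subset of the original columns, which retain
# their original meaning and units.
selector = SelectKBest(score_func=f_classif, k=5)
X_selected = selector.fit_transform(X, y)
print("Indices of selected original columns:", selector.get_support(indices=True))
```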
FS is used as a preprocessing phase in many contexts. It plays an important role in applications that involve a large number of features and only a few samples. FS enables data mining algorithms to run when it would otherwise be impossible given the dimensionality of the dataset. Furthermore, it makes it possible to focus only on relevant features and to avoid redundant information. The FS strategy consists of the following steps. From the original set of features, a candidate subset is generated and then evaluated by means of an evaluation criterion. The goodness of each subset is analyzed and, if it satisfies the stopping rule, the subset is selected and then validated. Otherwise, a new candidate subset is generated and the process is repeated.
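To make this generic loop concrete, the following Python sketch instantiates it with a greedy forward generation step, cross-validated accuracy as the evaluation criterion, and two simple stopping rules (a maximum subset size and a minimum improvement). These choices are assumptions made only for illustration; they are not the fuzzy criterion or the ReGEC-based procedure proposed in this paper.

```python
# Hedged sketch of the generic FS loop: generate a candidate subset,
# evaluate it, check a stopping rule, and validate or iterate.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

def evaluate(subset, X, y):
    """Evaluation criterion: cross-validated accuracy on the candidate subset."""
    clf = KNeighborsClassifier(n_neighbors=5)
    return cross_val_score(clf, X[:, subset], y, cv=5).mean()

def forward_selection(X, y, max_features=5, min_gain=1e-3):
    selected, best_score = [], 0.0
    while len(selected) < max_features:                  # stopping rule: subset size
        candidates = [f for f in range(X.shape[1]) if f not in selected]
        # Generation step: extend the current subset by one feature at a time.
        scores = {f: evaluate(selected + [f], X, y) for f in candidates}
        f_best = max(scores, key=scores.get)
        if scores[f_best] - best_score < min_gain:       # stopping rule: no gain
            break
        selected.append(f_best)
        best_score = scores[f_best]
    return selected, best_score

X, y = load_breast_cancer(return_X_y=True)
subset, score = forward_selection(X, y)
print("Selected features:", subset, "CV accuracy:", round(score, 3))
```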