Joint feature and instance selection using manifold data criteria: application to image classification



Fadi Dornaika 1,2

© Springer Nature B.V. 2020

Abstract

In many pattern recognition applications, feature selection and instance selection can be used as two data preprocessing methods that aim to reduce the computational cost of the learning process. Moreover, in some cases, feature subset selection can improve classification performance. Both are of interest because the choice of features and instances greatly influences the performance of the learned models as well as their training costs. In the past, the two problems were unified by solving a global optimization problem using meta-heuristics. This paradigm does not exploit the manifold structure of the data and can also be computationally expensive. To the best of our knowledge, the joint use of sparse modeling representative selection and feature subset relevance has not been exploited by joint feature and instance selection methods. In this paper, we target joint feature and instance selection by adopting feature subset relevance and sparse modeling representative selection. More precisely, we propose three schemes for joint feature and instance selection. The first is a wrapper technique, while the remaining two are filter approaches. In the filter approaches, the search process adopts a genetic algorithm in which the evaluation is mainly given by a score that quantifies the goodness of the features and instances. An efficient instance selection technique is integrated into the search process in order to adapt the instances to the candidate feature subset. We evaluate the performance of the proposed schemes on image classification, where the classifiers are the nearest neighbor classifier and the support vector machine classifier. The study is conducted on five public image datasets. These experiments show the superiority of the proposed schemes over various baselines. The results confirm that the filter approaches lead to promising improvements in classification accuracy when both feature selection and instance selection are adopted.

Keywords  Feature selection · Instance selection · Feature and instance selection · Data reduction · Linear discriminant analysis (LDA) · Local discriminant embedding (LDE) · Classification
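The filter scheme summarized in the abstract (a genetic search over feature subsets, with instance selection re-run for each candidate subset and a score evaluating the result) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the function names, the GA settings, the trace-ratio scatter score standing in for the manifold criterion, and the nearest-to-class-mean rule standing in for sparse modeling representative selection.

```python
import numpy as np

def select_instances(X, y, keep_per_class):
    """Proxy instance selection (assumption): per class, keep the samples
    closest to the class mean in the current feature subspace."""
    keep = []
    for c in np.unique(y):
        idx = np.flatnonzero(y == c)
        dist = np.linalg.norm(X[idx] - X[idx].mean(axis=0), axis=1)
        keep.extend(idx[np.argsort(dist)[:keep_per_class]])
    return np.sort(np.array(keep))

def scatter_score(X, y):
    """LDA-style trace ratio: between-class over within-class scatter."""
    mean = X.mean(axis=0)
    between = within = 0.0
    for c in np.unique(y):
        Xc = X[y == c]
        between += len(Xc) * np.sum((Xc.mean(axis=0) - mean) ** 2)
        within += np.sum((Xc - Xc.mean(axis=0)) ** 2)
    return between / (within + 1e-12)

def joint_selection(X, y, keep_per_class=5, pop_size=20, generations=30, seed=0):
    """Genetic search over boolean feature masks; instances are re-selected
    for every candidate mask before the mask is scored."""
    rng = np.random.default_rng(seed)
    pop = rng.random((pop_size, X.shape[1])) < 0.5

    def evaluate(mask):
        if not mask.any():              # an empty feature subset is invalid
            return -np.inf, None
        Xf = X[:, mask]
        inst = select_instances(Xf, y, keep_per_class)
        return scatter_score(Xf[inst], y[inst]), inst

    best_score, best_mask, best_inst = -np.inf, None, None
    for _ in range(generations):
        scored = [evaluate(m) for m in pop]
        fitness = np.array([s for s, _ in scored])
        i = int(np.argmax(fitness))
        if fitness[i] > best_score:
            best_score, best_mask, best_inst = fitness[i], pop[i].copy(), scored[i][1]
        # Binary tournament selection of parents.
        a = rng.integers(pop_size, size=pop_size)
        b = rng.integers(pop_size, size=pop_size)
        parents = pop[np.where(fitness[a] >= fitness[b], a, b)]
        # Uniform crossover with randomly permuted partners, then bit-flip mutation.
        partners = parents[rng.permutation(pop_size)]
        children = np.where(rng.random(parents.shape) < 0.5, parents, partners)
        pop = children ^ (rng.random(children.shape) < 0.05)
        if best_mask is not None:       # elitism: keep the best mask found so far
            pop[0] = best_mask
    return best_mask, best_inst, best_score
```

Calling `joint_selection(X, y)` on a feature matrix `X` (samples by features) and a label vector `y` returns a boolean feature mask, the indices of the retained instances, and the score of that pair. Because the instances are chosen inside `evaluate`, each candidate feature subset gets its own instance subset, which is the coupling the abstract describes.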

* Fadi Dornaika [email protected]

1 University of the Basque Country UPV/EHU, San Sebastian, Spain

2 IKERBASQUE, Basque Foundation for Science, Bilbao, Spain




1 Introduction

Feature Selection (or dimensionality reduction) and Instance Selection (or record reduction) have been active research topics in the preprocessing of image, video, and text datasets, as they promise faster processing and reduced complexity in classification problems. The importance of these tasks is increasingly recognized in the real-world dataset problems that are addressed in machine learning and autonomous systems. Feature Selection methods reduce the dimensionality of