Discriminative Feature Selection via Multiclass Variable Memory Markov Model

Noam Slonim
School of Engineering and Computer Science and Interdisciplinary Center for Neural Computation, The Hebrew University of Jerusalem, Jerusalem 91904, Israel
Email: [email protected]

Gill Bejerano
School of Engineering and Computer Science, The Hebrew University of Jerusalem, Jerusalem 91904, Israel
Email: [email protected]

Shai Fine
IBM Research Laboratory in Haifa, Haifa University, Mount Carmel, Haifa 31905, Israel
Email: [email protected]

Naftali Tishby
School of Engineering and Computer Science and Interdisciplinary Center for Neural Computation, The Hebrew University of Jerusalem, Jerusalem 91904, Israel
Email: [email protected]

Received 18 April 2002 and in revised form 15 November 2002

We propose a novel feature selection method based on a variable memory Markov (VMM) model. The VMM was originally proposed as a generative model that aims to preserve the statistics of the original source as estimated from training data. We extend this technique to handle several sources simultaneously, and further apply a new criterion to prune nondiscriminative features out of the model. The result is a multiclass discriminative VMM (DVMM), which is highly efficient, scaling linearly with the size of the data. Moreover, we suggest a natural scheme for sorting the remaining features by their discriminative power with respect to the sources at hand. We demonstrate the utility of our method on text and protein classification tasks.

Keywords and phrases: variable memory Markov (VMM) model, feature selection, multiclass discriminative analysis.

1. INTRODUCTION
Feature selection is one of the most fundamental problems in pattern recognition and machine learning. In this approach, one sorts all possible features by some predefined criterion and selects only the "best" ones for the task at hand. It may thus be possible to significantly reduce the model dimensions without impeding the performance of the learning algorithm. In some cases one may even gain generalization power by filtering irrelevant features (cf. [1]). The need for a good feature selection technique also stems from the practical concern that estimating the joint distribution between the classes and the feature vectors, when either the dimensionality of the feature space or the number of classes is very large, requires impractically large training sets. Indeed, increasing the number of features while keeping the number of samples fixed can actually lead to a decrease in the accuracy of the classifier [2, 3].

In this paper, we present a novel method for feature selection based on a variable memory Markov (VMM) model [4]. For a large variety of sequential data, statistical correlations decrease rapidly with the distance between symbols in the sequence. In particular, consider the conditional (empirical) probability distribution of the next symbol given its preceding subsequence. If the statistical correlations are indeed decreasing, then there exists a length L (the memory length) such that this conditional distribution does not change substantially when conditioned on subsequences longer than L.
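To make this concrete, the following is a minimal sketch (in Python, not the authors' implementation) of the empirical conditional distributions that underlie a VMM: it estimates P(next symbol | suffix) for every suffix up to a maximal length from a single training sequence. The function name suffix_conditionals and the toy string are illustrative assumptions, not part of the paper.

```python
from collections import Counter, defaultdict

def suffix_conditionals(seq, max_len):
    """Map each suffix s observed in `seq` (1 <= len(s) <= max_len) to the
    empirical distribution of the symbol that immediately follows it."""
    counts = defaultdict(Counter)
    for i in range(1, len(seq)):
        for k in range(1, max_len + 1):
            if i - k < 0:
                break
            counts[seq[i - k:i]][seq[i]] += 1  # suffix -> next-symbol count
    # Normalize the raw counts into conditional probabilities.
    return {s: {sym: n / sum(c.values()) for sym, n in c.items()}
            for s, c in counts.items()}

# If P(. | "ab") is close to P(. | "b"), the longer context adds no memory,
# so a VMM would prune it and keep only the shorter suffix "b".
dists = suffix_conditionals("abracadabra", max_len=2)
print(dists["b"], dists["ab"])
```

In this toy example both suffixes yield the same next-symbol distribution, which is exactly the situation in which a variable memory model keeps the short context and discards the long one; memory is extended only where the longer suffix changes the prediction.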