Fast Nonnegative Matrix Factorization and Its Application for Protein Fold Recognition

  • PDF / 367,857 Bytes
  • 8 Pages / 600.03 x 792 pts Page_size
  • 20 Downloads / 187 Views

DOWNLOAD

REPORT


Fast Nonnegative Matrix Factorization and Its Application for Protein Fold Recognition Oleg Okun and Helen Priisalu Machine Vision Group, Infotech Oulu and Department of Electrical and Information Engineering, University of Oulu, P.O. Box 4500, 90014, Finland Received 27 April 2005; Revised 29 September 2005; Accepted 8 December 2005 Linear and unsupervised dimensionality reduction via matrix factorization with nonnegativity constraints is studied. Because of these constraints, it stands apart from other linear dimensionality reduction methods. Here we explore nonnegative matrix factorization in combination with three nearest-neighbor classifiers for protein fold recognition. Since typically matrix factorization is iteratively done, convergence, can be slow. To speed up convergence, we perform feature scaling (normalization) prior to the beginning of iterations. This results in a significantly (more than 11 times) faster algorithm. Justification of why it happens is provided. Another modification of the standard nonnegative matrix factorization algorithm is concerned with combining two known techniques for mapping unseen data. This operation is typically necessary before classifying the data in low-dimensional space. Combining two mapping techniques can yield better accuracy than using either technique alone. The gains, however, depend on the state of the random number generator used for initialization of iterations, a classifier, and its parameters. In particular, when employing the best out of three classifiers and reducing the original dimensionality by around 30%, these gains can reach more than 4%, compared to the classification in the original, high-dimensional space. Copyright © 2006 Hindawi Publishing Corporation. All rights reserved.

1.

INTRODUCTION

It is not uncommon that for certain data sets, their dimensionality n is higher than the number of attributes or features m (here and further it is assumed that the data are accumulated in an n × m matrix accomodating m n-dimensional feature vectors). In such cases, the effect, referred to as curse of dimensionality, occurs that negatively influences the clustering and classification of a given data set. Dimensionality reduction is typically used to cure or at least to mitigate this effect and can be done by means of feature extraction (FE) or feature selection (FS). FS selects a subset of the original features based on a certain criterion of feature importance or relevance whereas FE produces a set of transformed (i.e., new) features from the original ones. Features chosen by FS are easy to interpret while those found by FE may be not. In addition, FS often assumes knowledge of class membership information, that is, it is often supervised, in contrast to FE that is usually unsupervised. Thus, FS looks naturally more attractive than FE from the classification viewpoint, that is, when FS is followed by classification of a data set. However, there can be the cases where all or almost all original features turn out, to be important (relevant) so that FS becomes ina