Comparison of Non-negative Matrix Factorization Methods for Clustering Genomic Data


1 School of Information Science and Engineering, Qufu Normal University, Rizhao 276826, China
{mixiaohou,shangjunliang110}@163.com, {sdcavell,zhengch99}@126.com
2 Library of Qufu Normal University, Qufu Normal University, Rizhao 276826, China
[email protected]
3 Shenzhen Graduate School, Bio-Computing Research Center, Harbin Institute of Technology, Shenzhen 518055, China

Abstract. Non-negative matrix factorization (NMF) is a useful method for dimensionality reduction and has been widely used in many fields, such as pattern recognition and data mining. Compared with traditional methods, it has unique advantages. Many improved NMF methods have been proposed in recent years, and each has merits and drawbacks depending on the application. Clustering based on NMF is a common way to assess the properties of these methods. However, there has been no dedicated comparison of NMF-based clustering experiments on genomic data. In this paper, we analyze the characteristics of basic NMF and its classical variants. Moreover, we present clustering results based on the coefficient matrix obtained by each NMF method on genomic datasets. We also compare the clustering accuracy and running time of these methods.

Keywords: Non-negative matrix factorization · Dimensionality reduction · Clustering · Genomic data

© Springer International Publishing Switzerland 2016 D.-S. Huang and K.-H. Jo (Eds.): ICIC 2016, Part II, LNCS 9772, pp. 290–299, 2016. DOI: 10.1007/978-3-319-42294-7_25

1 Introduction

As we enter the era of big data, massive high-dimensional data are generated continuously. Reducing the dimensionality of such data so that it can be stored, processed and reconstructed is a challenge in machine learning and data mining. There are numerous traditional dimensionality reduction methods, such as Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA). These methods allow negative values in their results, which is not appropriate in some cases. They also rely on linear dimensionality reduction techniques, which are not conducive to retaining the characteristics of the data. As a novel matrix factorization method, NMF overcomes many problems of traditional matrix factorization methods and provides a deeper view of the data. NMF produces two non-negative matrices whose product approximates the original data matrix, which reflects the concept of part-based representation in human thought. By reducing dimensionality, NMF yields a local representation of high-dimensional data. It has been used successfully in bioinformatics, for example in genome sequence feature recognition, local feature recognition, and biological literature mining. In recent years, many scholars have used NMF methods for clustering experiments, such as document clustering, image clustering, and tumor clustering. But there are no clear comparisons of clustering experiments on genomic data,
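To make the idea of approximating a data matrix with two non-negative factors concrete, the following is a minimal sketch rather than the method evaluated in the paper. It assumes the standard multiplicative update rules for the squared Frobenius error, a features-by-samples layout for the data matrix, and a simple rule that assigns each sample to the component with the largest coefficient. The function names `nmf_multiplicative` and `cluster_from_coefficients` and the synthetic toy matrix are hypothetical and introduced only for illustration.

```python
import numpy as np


def nmf_multiplicative(X, k, n_iter=500, eps=1e-10, seed=0):
    """Approximate a non-negative X (m x n) as W @ H with W (m x k), H (k x n).

    Assumption: multiplicative updates for the squared Frobenius error.
    The excerpt does not state which update rule the authors use.
    """
    rng = np.random.default_rng(seed)
    m, n = X.shape
    # Strictly positive random start so the updates never divide by zero.
    W = rng.random((m, k)) + eps
    H = rng.random((k, n)) + eps
    for _ in range(n_iter):
        # Each update multiplies by a non-negative ratio, so W and H stay >= 0.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H


def cluster_from_coefficients(H):
    """Assign each sample (column of H) to its dominant component.

    Assumption: argmax assignment; other clusterings of H are possible.
    """
    return np.argmax(H, axis=0)


# Synthetic toy data (not genomic): 6 features x 8 samples in two groups.
rng = np.random.default_rng(1)
X = 0.01 * rng.random((6, 8))
X[:3, :4] += 1.0 + 0.1 * rng.random((3, 4))  # group A uses features 0-2
X[3:, 4:] += 1.0 + 0.1 * rng.random((3, 4))  # group B uses features 3-5

W, H = nmf_multiplicative(X, k=2)
labels = cluster_from_coefficients(H)
rel_error = np.linalg.norm(X - W @ H) / np.linalg.norm(X)
print("labels:", labels)
print("relative reconstruction error:", rel_error)
```

Here the columns of `W` play the role of parts (basis vectors) and the columns of `H` are the low-dimensional coefficients of each sample, which is why clustering is carried out on `H`. The rank `k` is set to the number of groups in the toy data; for real datasets it would have to be chosen separately.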

