Bayesian mean-parameterized nonnegative binary matrix factorization



Alberto Lumbreras · Louis Filstroff · Cédric Févotte

Received: 17 December 2018 / Accepted: 17 August 2020 © The Author(s), under exclusive licence to Springer Science+Business Media LLC, part of Springer Nature 2020

Abstract

Binary data matrices can represent many types of data such as social networks, votes, or gene expression. In some cases, the analysis of binary matrices can be tackled with nonnegative matrix factorization (NMF), where the observed data matrix is approximated by the product of two smaller nonnegative matrices. In this context, probabilistic NMF assumes a generative model where the data is usually Bernoulli-distributed. Often, a link function is used to map the factorization to the [0, 1] range, ensuring a valid Bernoulli mean parameter. However, link functions have the potential disadvantage of leading to uninterpretable models. Mean-parameterized NMF, on the contrary, overcomes this problem. We propose a unified framework for Bayesian mean-parameterized nonnegative binary matrix factorization models (NBMF). We analyze three models which correspond to three possible constraints that respect the mean-parameterization without the need for link functions. Furthermore, we derive a novel collapsed Gibbs sampler and a collapsed variational algorithm to infer the posterior distribution of the factors. Next, we extend the proposed models to a nonparametric setting where the number of used latent dimensions is automatically driven by the observed data. We analyze the performance of our NBMF methods on multiple datasets for different tasks such as dictionary learning and prediction of missing data. Experiments show that our methods provide results comparable or superior to the state of the art, while automatically detecting the number of relevant components.

Keywords Matrix factorization · Latent variable models · Bayesian inference · Binary data
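To illustrate the mean-parameterization idea from the abstract, the following sketch generates binary data from a Bernoulli model whose mean matrix is itself a nonnegative factorization, with no link function. The specific constraint used here (rows of W on the simplex, entries of H in [0, 1]) is only one sufficient way to keep WH in [0, 1]; it is an illustrative assumption, not necessarily one of the three constraints analyzed in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
F, N, K = 6, 8, 3

# Dictionary W: each row is nonnegative and sums to 1 (lies on the simplex).
W = rng.dirichlet(np.ones(K), size=F)       # shape (F, K)

# Activations H: entries restricted to [0, 1].
H = rng.uniform(0.0, 1.0, size=(K, N))      # shape (K, N)

# Each entry of WH is a convex combination of values in [0, 1],
# so it is a valid Bernoulli mean directly -- no sigmoid or other link.
P = W @ H
assert np.all((P >= 0.0) & (P <= 1.0))

# Binary observations drawn from the mean-parameterized model.
V = rng.binomial(1, P)                      # shape (F, N), entries in {0, 1}
```

Because P is the Bernoulli mean itself, the factors W and H retain a direct probabilistic interpretation, which is the interpretability advantage the abstract attributes to mean-parameterized NMF.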

Responsible editor: Pauli Miettinen.


Alberto Lumbreras [email protected]

Extended author information available on the last page of the article



1 Introduction

Nonnegative matrix factorization (NMF) is a family of methods that approximate a nonnegative matrix V of size F × N as the product of two nonnegative matrices,

V ≈ WH,    (1)

where W has size F × K and H has size K × N, often referred to as the dictionary and the activation matrix, respectively. K is usually chosen such that FK + KN ≪ FN, hence reducing the data dimension. Such an approximation is often sought by minimizing a measure of fit between the observed data V and its factorized approximation WH, i.e.,

W, H = arg min_{W,H} D(V|WH)  s.t.  W ≥ 0, H ≥ 0,    (2)

where D denotes the cost function, and where the notation A ≥ 0 denotes nonnegativity of the entries of A. Typical cost functions include the squared Euclidean distance, the generalized Kullback–Leibler divergence (Lee and Seung 2001), the α-divergence (Cichocki et al. 2008), and the β-divergence (Févotte and Idier 2011). Most of the