Variational auto-encoder based Bayesian Poisson tensor factorization for sparse and imbalanced count data

Variational auto-encoder based Bayesian Poisson tensor factorization for sparse and imbalanced count data

Yuan Jin1 · Ming Liu2 · Yunfeng Li3 · Ruohua Xu3 · Lan Du1 · Longxiang Gao2 · Yong Xiang2

Received: 29 February 2020 / Accepted: 3 November 2020
© The Author(s), under exclusive licence to Springer Science+Business Media LLC, part of Springer Nature 2020

Abstract

Non-negative tensor factorization models enable predictive analysis on count data. Among them, Bayesian Poisson–Gamma models can derive full posterior distributions over latent factors and are less sensitive to sparse count data. However, current inference methods for these Bayesian models adopt restricted update rules for the posterior parameters. They also fail to share update information to better cope with data sparsity. Moreover, these models lack a component that handles the imbalance in count data values. In this paper, we propose a novel variational auto-encoder framework, called VAE-BPTF, which addresses the above issues. It uses multi-layer perceptron networks to encode and share complex update information. The encoded information is then reweighted per data instance to penalize common data values before being aggregated to compute the posterior parameters of the latent factors. In evaluations on synthetic data, VAE-BPTF tended to recover the right number of latent factors and the posterior parameter values. It also outperformed current models in both reconstruction error and latent factor (semantic) coherence across five real-world datasets. Furthermore, the latent factors inferred by VAE-BPTF were perceived as meaningful and coherent in a qualitative analysis.

Keywords Non-negative tensor factorization · Variational auto-encoders · Neural networks · Latent variable modelling · Count data

Responsible editor: Sriraam Natarajan.

✉ Yuan Jin
  [email protected]

1 Faculty of Information Technology, Monash University, Melbourne, VIC 3800, Australia
2 School of Information Technology, Deakin University, Melbourne, VIC 3125, Australia
3 Sandstone Pty Ltd, 32-42 Barker St, Kingsford, NSW 2032, Australia


1 Introduction

In this paper, we focus on improving the performance of Bayesian Poisson tensor factorization (BPTF). BPTF imposes Gamma distributions as priors over its latent factors; these factors then form the instance-wise rates of a Poisson likelihood over the observed counts. To compute the posterior shape and rate parameters of its Gamma latent factors, BPTF adopts two types of inference frameworks: Gibbs sampling and variational inference. Both rely on the auxiliary-variable augmentation technique to facilitate their computation. This technique is based on the Poisson–Gamma conjugacy and exploits the fact that a sum of auxiliary Poisson variables with their respective rates is itself Poisson distributed with a rate equal to the sum of those rates. Despite its importance, the augmentation technique increases the computational overhead due to the additional sampling procedures/updates.
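To make this concrete, the following is a minimal sketch, in LaTeX notation, of the standard BPTF generative model for a third-order count tensor and the superposition identity that underlies the augmentation. The factor symbols (theta, beta, gamma), the mode indices (i, j, t), the number of components K, and the hyperparameters (a_0, b_0, etc.) are illustrative assumptions, not necessarily this paper's own notation:

    % Gamma priors over the mode-wise latent factors of a
    % third-order count tensor with K latent components
    \theta_{ik} \sim \mathrm{Gamma}(a_0, b_0), \qquad
    \beta_{jk}  \sim \mathrm{Gamma}(c_0, d_0), \qquad
    \gamma_{tk} \sim \mathrm{Gamma}(e_0, f_0)

    % Poisson likelihood: the latent factors form the instance-wise rate
    y_{ijt} \sim \mathrm{Poisson}\!\left(\sum_{k=1}^{K} \theta_{ik}\,\beta_{jk}\,\gamma_{tk}\right)

    % Auxiliary-variable augmentation: split each count into K latent
    % counts, one per component; by the superposition property their
    % sum recovers the original Poisson likelihood
    y_{ijtk} \sim \mathrm{Poisson}(\theta_{ik}\,\beta_{jk}\,\gamma_{tk}), \qquad
    y_{ijt} = \sum_{k=1}^{K} y_{ijtk}
    \;\sim\; \mathrm{Poisson}\!\left(\sum_{k=1}^{K} \theta_{ik}\,\beta_{jk}\,\gamma_{tk}\right)

Conditioned on the observed total y_{ijt}, the auxiliary counts (y_{ijt1}, ..., y_{ijtK}) follow a multinomial distribution with probabilities proportional to the per-component rates; this is what yields closed-form Gibbs and variational updates, and it is also the step that incurs the extra sampling/update cost noted above.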