Bayesian latent factor regression for multivariate functional data with variable selection


Online ISSN 2005-2863 Print ISSN 1226-3192

RESEARCH ARTICLE

Heesang Noh¹ · Taeryon Choi² · Jinsu Park¹ · Yeonseung Chung¹

Received: 13 May 2019 / Accepted: 10 December 2019
© Korean Statistical Society 2020

Abstract

In biomedical research, multivariate functional data are frequently encountered. The majority of existing approaches to functional data analysis focus on univariate functional data, and the methodology for multivariate functional data is far less studied. In particular, the problem of investigating covariate effects on multivariate functional data has received little attention. In this research, we propose a fully Bayesian latent factor regression for studying covariate effects on multivariate functional data. The proposed model obtains a low-dimensional representation of multivariate functional data through basis expansions for splines and factor analysis for the basis coefficients. The latent factors specific to each functional outcome are then regressed onto covariates while accounting for residual correlations among the multiple outcomes. Covariate effects are assessed through the marginal inclusion probability of each covariate, which is calculated a posteriori by assigning a stochastic search variable selection (SSVS) prior to the regression coefficients. To better control the false discovery rate, we propose a multivariate SSVS prior that allows a set of coefficients to be zero simultaneously. We illustrate the proposed method through a simulation study and an application to air pollution data collected for 13 cities in China.

Keywords Multivariate functional data · Bayesian latent factor regression · Basis functions for splines · Stochastic search variable selection · Multiplicative gamma process shrinkage
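As a rough illustration of the marginal inclusion probability mentioned in the abstract, the sketch below averages binary inclusion indicators over posterior draws. It is a minimal sketch, not the authors' sampler: the function name, the simulated indicator draws, the assumed inclusion rates, and the 0.5 selection threshold are all hypothetical choices made for illustration, and in practice the indicators would come from MCMC output under the SSVS prior.

```python
import random


def marginal_inclusion_probabilities(indicator_draws):
    """Average each covariate's 0/1 inclusion indicator over posterior draws.

    indicator_draws: list of draws, each a list of 0/1 values (one per covariate).
    """
    n_draws = len(indicator_draws)
    n_covariates = len(indicator_draws[0])
    return [
        sum(draw[j] for draw in indicator_draws) / n_draws
        for j in range(n_covariates)
    ]


# Hypothetical stand-in for MCMC output: indicators simulated independently
# with assumed inclusion rates (these numbers are not from the paper).
random.seed(0)
assumed_rates = [0.95, 0.50, 0.05]
indicator_draws = [
    [1 if random.random() < p else 0 for p in assumed_rates]
    for _ in range(5000)
]

mip = marginal_inclusion_probabilities(indicator_draws)

# Assumed decision rule: flag covariates whose inclusion probability exceeds 0.5.
selected = [p > 0.5 for p in mip]
print(mip, selected)
```

The threshold of 0.5 is only one common convention; the cutoff appropriate for a given false discovery target may differ.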

Electronic supplementary material The online version of this article (https://doi.org/10.1007/s42952-019-00044-6) contains supplementary material, which is available to authorized users.

Yeonseung Chung [email protected]

¹ Department of Mathematical Sciences, Korea Advanced Institute of Science and Technology, Daejeon, South Korea

² Department of Statistics, Korea University, Seoul, South Korea

Journal of the Korean Statistical Society

1 Introduction

Multivariate functional data are frequently encountered in biomedical research. Such data are typically defined on a continuous domain but are sampled on a discrete grid, which may be dense or sparse and regular or irregular across subjects and outcomes. There is a rich literature on statistical methods for functional data analysis (Ramsay 2005; Ramsay and Silverman 2007) with different inferential focuses, such as estimating the mean or individual trajectories, functional clustering and classification, and functional regression (Yao and Lee 2006; Ray and Mallick 2006; Rodriguez and Dunson 2014; Morris 2015; Suarez and Ghosal 2016; Wang et al. 2016). Majority of the existing appro