A Copula-Based GLMM Model for Multivariate Longitudinal Data with Mixed-Types of Responses

Weiping Zhang, MengMeng Zhang and Yu Chen
University of Science and Technology of China, Hefei, China

Abstract

We propose a copula-based generalized linear mixed model (GLMM) to jointly analyze multivariate longitudinal data with mixed types, including continuous, count and binary responses. The association among repeated measurements is modelled through the GLMM, while a pair-copula construction (D-vine) is adopted to capture the dependence structure between different responses. By combining mixed models and D-vine copulas, the proposed approach can not only handle unbalanced data with arbitrary margins but also cope with moderate-dimensional problems, owing to the efficiency and flexibility of D-vines. Based on D-vine copulas, algorithms for sampling mixed data and computing the likelihood are also developed. Leaving the random effects distribution unspecified, we use nonparametric maximum likelihood for model fitting. An E-M algorithm is then used to obtain the maximum likelihood estimates of the parameters. Both simulations and real data analysis show that the nonparametric models are more efficient and flexible than the parametric models.

AMS (2000) subject classification. Primary 62G05; Secondary 62J12.

Keywords and phrases. Longitudinal data, Mixed types, Joint estimate, D-vine copula, Nonparametric maximum likelihood, E-M algorithm

1 Introduction

Multivariate longitudinal data analyses have attracted increasing interest in many fields, including the health, social and behavioural sciences, as they allow the researcher to study the joint evolution of multiple outcomes over time and to understand the relationship between different responses. Because repeated observations on any given response are likely to be correlated over time, and multiple responses measured at a given time point are also likely to be correlated, it is crucial to properly model both the association among different responses and the association among the repeated measurements of each response. A proper correlation structure or joint distribution is introduced to mimic such associations; see Bandyopadhyay et al. (2011) and Verbeke et al. (2014) for a review. The problem becomes more challenging when the outcomes are of mixed types, including continuous, count and binary responses. By assuming a working correlation structure among repeated measurements, the generalized estimating equation (GEE) approach proposed by Liang and Zeger (1986) has been used to analyze multivariate longitudinal data with mixed types; see Zeger and Liang (1991), Rochon (1996), and Cho (2016). Although the GEE yields a consistent estimator of the regression parameter even under an incorrect correlation structure, the estimator can be inefficient (Wang and Carey, 2003). In contrast to the GEE method, Fieuws et al. (2007) and Jaffa et al. (2016) extended the random-effects models (Laird and Ware, 1982) to a multivariate framework and constructed a full likelihood function for all outcomes by specifying a joint distribution for the rand