Statistical inference for multivariate longitudinal data with irregular auto-correlated error process


. ARTICLES .

https://doi.org/10.1007/s11425-018-9466-8

Youquan Pei1,∗, Yiming Tang2 & Tao Huang2

1School of Economics, Shandong University, Jinan 250100, China;
2School of Statistics and Management, Shanghai University of Finance and Economics, Shanghai 200433, China

Email: [email protected], [email protected], [email protected]

Received March 21, 2018; accepted December 4, 2018

Abstract

Multivariate longitudinal data arise frequently in a variety of applications, where multiple outcomes are measured repeatedly from the same subject. In this paper, we first propose a two-stage weighted least squares estimation procedure for the regression coefficients when the random error follows an irregular autoregressive (AR) process, and establish the asymptotic normality of the resulting estimators. We then apply the smoothly clipped absolute deviation (SCAD) variable selection approach to determine the order of the AR error process. We further propose a test statistic to check whether multiple responses are correlated at the same observation time, and derive the asymptotic distribution of the proposed test statistic. Several simulated examples and a real data analysis are presented to illustrate the finite-sample performance of the proposed method.

Keywords

multivariate longitudinal data, autoregressive error, two-stage weighted least square, hypothesis testing

MSC(2010)

62F03, 62F05, 62F86

Citation: Pei Y Q, Tang Y M, Huang T. Statistical inference for multivariate longitudinal data with irregular auto-correlated error process. Sci China Math, 2020, 63, https://doi.org/10.1007/s11425-018-9466-8

1 Introduction

Longitudinal data arise frequently in many areas of scientific research, and a variety of statistical models have been proposed over the last few decades for analyzing such data. However, most of these models are confined to the analysis of univariate longitudinal data (see [3, 4, 7, 9]). In practice, multivariate longitudinal data arise when a set of different outcomes of the same unit is measured repeatedly over time. For example, in a data set on the quality of paper making [12], several physical characteristics of the paper, including the tensile index (ng/g), burst index (kPa m²/g), tear index (nN m²/g), and drainability of pulp (Schopper-Riegler (SR) number), were repeatedly measured at beating times of 5, 15, 30, 45 and 60 minutes for 48 batches of pine sulfate pulp. Instead of modeling each longitudinal response variable separately, it is natural and important to model the multivariate longitudinal responses simultaneously. Practically, joint modeling provides a unique opportunity to study the joint evolution of the various responses over time. Numerically, it may improve estimation efficiency by incorporating the correlation information between the responses.

∗ Corresponding author
© Science China Press and Springer-Verlag GmbH Germany, part of Springer Nature 2020

math.scic