Using Multiple Imputation with GEE with Non-monotone Missing Longitudinal Binary Outcomes




Stuart R. Lipsitz BRIGHAM AND WOMEN’S HOSPITAL AND ARIADNE LABS

Garrett M. Fitzmaurice and Roger D. Weiss MCLEAN HOSPITAL

This paper considers multiple imputation (MI) approaches for handling non-monotone missing longitudinal binary responses when estimating parameters of a marginal model using generalized estimating equations (GEE). GEE has been shown to yield consistent estimates of the regression parameters for a marginal model when data are missing completely at random (MCAR). However, when data are missing at random (MAR), the GEE estimates may not be consistent; the MI approaches proposed in this paper minimize bias under MAR. The first MI approach proposed is based on a multivariate normal distribution, but with the addition of pairwise products among the binary outcomes to the multivariate normal vector. Although the multivariate normal does not impute 0 or 1 values for the missing binary responses, we suggest, following the discussion in Horton et al. (Am Stat 57:229–232, 2003), not rounding the imputed values when filling in the missing binary data, because rounding could increase bias. The second MI approach considered is the fully conditional specification (FCS) approach. In this approach, we specify a logistic regression model for each outcome given the outcomes at the other time points and the covariates. Typically, one would include only main effects of the outcomes at the other times as predictors in the FCS approach, but we explore whether bias can be reduced by also including pairwise interactions of the outcomes at the other time points in the FCS models. In a study of asymptotic bias with non-monotone missing data, the proposed MI approaches are also compared to GEE without imputation. Finally, the proposed methods are illustrated using data from a longitudinal clinical trial comparing four psychosocial treatments from the National Institute on Drug Abuse Collaborative Cocaine Treatment Study, where patients’ cocaine use is collected monthly for 6 months during treatment.

Key words: fully conditional specification, generalized estimating equations, missing completely at random, missing at random, multivariate normal.
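To make the predictor set of the FCS approach concrete, the following is a minimal Python sketch of how the design matrix for one conditional logistic model might be assembled: an intercept, the covariates, the main effects of the outcomes at the other time points, and (optionally) their pairwise products. The function name `fcs_predictors` and the arguments `Y`, `X`, and `interactions` are hypothetical and are not taken from the paper. The sketch only builds the predictors; fitting the logistic model and drawing imputed values are not shown.

```python
from itertools import combinations

import numpy as np


def fcs_predictors(Y, t, X=None, interactions=True):
    """Predictor matrix for the conditional model of the outcome at time t.

    Y : (n, T) array of binary outcomes, assumed already filled in for the
        current FCS cycle (hypothetical layout, one column per time point).
    t : column index of the outcome being modelled.
    X : optional (n, p) array of covariates.
    interactions : if True, add pairwise products of the other outcomes.
    """
    Y = np.asarray(Y, dtype=float)
    n, T = Y.shape
    others = [s for s in range(T) if s != t]

    cols = [np.ones(n)]                                   # intercept
    if X is not None:
        cols.extend(np.asarray(X, dtype=float).reshape(n, -1).T)  # covariates
    cols.extend(Y[:, s] for s in others)                  # main effects
    if interactions:
        # pairwise products among the outcomes at the other time points
        cols.extend(Y[:, s] * Y[:, u] for s, u in combinations(others, 2))
    return np.column_stack(cols)


# Two subjects, four time points; model the outcome at the first time point.
Y_demo = np.array([[1, 0, 1, 1],
                   [0, 1, 1, 0]])
D_demo = fcs_predictors(Y_demo, t=0)
```

With six monthly outcomes, as in the motivating trial, each conditional model under this layout would have 5 main-effect terms and 10 pairwise-product terms in addition to the intercept and covariates.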

1. Introduction

Longitudinal studies in which each subject is to be observed at a fixed number of times are common in medicine. In this paper, we consider statistical methods for the analysis of such data when the outcome is binary (e.g., success or failure) and the missing data pattern is not monotone; for example, a subject’s outcome variable can be observed at one time point, missing at the next time point, and then observed at later time point(s). Our motivating example is a longitudinal clinical trial from the National Institute on Drug Abuse Collaborative Cocaine Treatment Study (CCTS) whose goal was to compare four psychosocial treatments to reduce cocaine use in patients with cocaine dependence (Crits-Christoph 1999).

Electronic supplementary material The online version of this article (https://doi.org/10.1007/s11336-02009729-y) contains s