Doubly robust augmented-estimating-equations estimation with nonignorable nonresponse data


Tianqing Liu1 · Xiaohui Yuan2

Received: 10 December 2017 / Revised: 2 June 2018
© Springer-Verlag GmbH Germany, part of Springer Nature 2018

Abstract The problem of nonignorable nonresponse data is ubiquitous in medical and social science studies. Analyses that rely only on the missing-at-random assumption may lead to biased results. Various debiasing methods have been extensively studied in the literature, particularly doubly robust (DR) estimators. We propose DR augmented-estimating-equations (AEE) estimators of the mean response, which enjoy the double-robustness property under correct specification of the log odds ratio model. An advantage of DR AEE estimators is that they can use the completely observed covariates to improve on the estimation efficiency of existing DR estimators with nonignorable nonresponse data. We propose a model selection criterion that can consistently select the correct parametric form of the log odds ratio model from a group of candidate models. Moreover, the correctness of the required working models can be evaluated via straightforward goodness-of-fit tests. Simulation results indicate that DR AEE estimators are very robust to misspecification of the baseline outcome density model or the baseline response model, and dominate other competitors in the sense of having smaller mean-square errors. The analysis of a real dataset illustrates the flexibility and usefulness of the proposed methods.

Keywords Augmented estimating equations · Doubly robust · Goodness-of-fit tests · Non-ignorable missing data · Nonresponse instrumental variable

1 School of Mathematics, Jilin University, Changchun 130012, Jilin, China
2 School of Mathematics and Statistics, Changchun University of Technology, Changchun 130012, Jilin, China


1 Introduction

Let y denote the outcome of interest, which may not be observed for all subjects, and let w = (x^T, z^T) be an l-dimensional vector of covariates that is always observed. Let r be the response indicator of y, i.e., r = 1 if y is observed and r = 0 otherwise. In the statistical literature, non-ignorable missingness (Little and Rubin 2002) is the most difficult problem, because the response probability pr(r = 1|w, y) depends on y regardless of whether y is observed or missing, and the joint distribution of (w, y, r) is not identifiable without further restrictions on pr(r = 1|w, y). For model identification, we assume throughout that the fully observed nonresponse instrumental variable z (Wang et al. 2014; Miao and Tchetgen 2016; Choi and Lee 2017) is associated with y given x but is unrelated to r once y and x are given:

$$z \not\perp\!\!\!\perp y \mid x \quad \text{and} \quad z \perp\!\!\!\perp r \mid (y, x). \tag{1}$$
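To make the setting concrete, the following is a minimal simulation sketch, not the authors' procedure. It assumes a hypothetical linear outcome model and a hypothetical logistic response model (all coefficients are illustrative choices, not taken from the paper) in which z affects y but enters the response probability only through y, so that the instrument conditions hold by construction. It then compares the full-data mean of y with the complete-case mean.

```python
import numpy as np


def simulate_nonignorable(n=200_000, seed=0):
    """Draw (x, z, y, r) under an assumed, purely illustrative model.

    y depends on both x and z, while the response probability depends
    on (y, x) only, so z plays the role of a nonresponse instrument.
    """
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    z = rng.normal(size=n)
    # Hypothetical outcome model: the true mean of y is 1.0.
    y = 1.0 + x + z + rng.normal(size=n)
    # Hypothetical response model: depends on y (nonignorable) and x, not on z.
    logit = 0.5 - 1.0 * y + 0.5 * x
    prob = 1.0 / (1.0 + np.exp(-logit))
    r = rng.binomial(1, prob)
    return x, z, y, r


x, z, y, r = simulate_nonignorable()
full_mean = y.mean()               # uses all y, unavailable in practice
cc_mean = y[r == 1].mean()         # complete-case mean, uses observed y only
print(f"full-data mean:     {full_mean:.3f}")
print(f"complete-case mean: {cc_mean:.3f}")
```

Because larger values of y are less likely to be observed under this assumed response model, the complete-case mean falls well below the full-data mean, which is the kind of bias that motivates the estimators studied here.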

Under assumption (1), Miao and Tchetgen (2016) factorized the conditional density function of (z, y, r) given x as

$$f(z, y, r \mid x) = c(x)\, \exp\{(1 - r)\,\mathrm{OR}(y \mid x)\}\, \mathrm{pr}(r \mid y = 0, x)\, f(z, y \mid r = 1, x), \tag{2}$$

where c(x) = pr(r = 1|x)
