Veridical causal inference using propensity score methods for comparative effectiveness research with medical claims


Ryan D. Ross¹ · Xu Shi¹ · Megan E. V. Caram²,³,⁴ · Phoebe A. Tsao² · Paul Lin⁴ · Amy Bohnert³,⁴,⁵ · Min Zhang¹ · Bhramar Mukherjee¹

Received: 29 April 2020 / Revised: 19 September 2020 / Accepted: 9 October 2020
© Springer Science+Business Media, LLC, part of Springer Nature 2020

Abstract
Medical insurance claims are becoming increasingly common data sources to answer a variety of questions in biomedical research. Although comprehensive in terms of longitudinal characterization of disease development and progression for a potentially large number of patients, population-based inference using these datasets requires thoughtful modifications to sample selection and analytic strategies relative to other types of studies. Along with complex selection bias and missing data issues, claims-based studies are purely observational, which limits effective understanding and characterization of the treatment differences between groups being compared. All these issues contribute to a crisis in reproducibility and replication of comparative findings using medical claims. This paper offers practical guidance on the analytical process; demonstrates methods for estimating causal treatment effects with propensity score methods for several types of outcomes common to such studies, such as binary, count, time-to-event and longitudinally varying measures; and aims to increase transparency and reproducibility of reporting of results from these investigations. We provide an online version of the paper with readily implementable code for the entire analysis pipeline to serve as a guided tutorial for practitioners. The online version can be accessed at https://rydaro.github.io/. The analytic pipeline is illustrated using a sub-cohort of patients with advanced prostate cancer from the large Clinformatics™ Data Mart Database (OptumInsight, Eden Prairie, Minnesota), consisting of 73 million distinct private payer insurees from 2001 to 2016.

Keywords  Average treatment effect · Covariate adjustment · Insurance claims · Hormone therapy · Matching · Prostate cancer · Reproducibility · Sensitivity analysis · Veridical data science

Electronic supplementary material  The online version of this article (https://doi.org/10.1007/s10742-020-00222-8) contains supplementary material, which is available to authorized users.

* Ryan D. Ross
Extended author information available on the last page of the article



Health Services and Outcomes Research Methodology

1 Introduction and background

Health service billing data can be used to answer many clinical and epidemiological questions across a large number of patients and have the potential to capture patterns in health care practice as they occur in the real world (Sherman et al. 2016; Izurieta et al. 2019; Noe et al. 2019; Nidey et al. 2020; O’Neal et al. 2018). Such large datasets allow investigators to conduct scientific queries whic