Statistical Guideline #4. Describe the Nature and Extent of Missing Data and Impute Where Possible and Prudent


INTEGRATIVE REVIEW

Suzanne C. Segerstrom 1

© International Society of Behavioral Medicine 2019

Abstract From the Editors: This is one in a series of statistical guidelines designed to highlight common statistical considerations in behavioral medicine research. The goal is to briefly discuss appropriate ways to analyze and present data in the International Journal of Behavioral Medicine (IJBM). Collectively the series will culminate in a set of basic statistical guidelines to be adopted by IJBM and integrated into the journal’s official Instructions for Authors, but also to serve as an independent resource. If you have ideas for a future topic, please email the Statistical Editor Suzanne Segerstrom at [email protected].

Keywords Missing data · Imputation · Statistical guidelines

* Suzanne C. Segerstrom [email protected]

1 Department of Psychology, University of Kentucky, 125 Kastle Hall, Lexington, KY 40506-0044, USA

The Statistics Guru

Unless you are running a simulation study, you are likely to have missing data, for example because of a skipped item or questionnaire page, a scale added after data collection has begun, a study dropout, or equipment failure. The fourth statistical guideline for IJBM is a recommendation for authors to describe the nature and extent of their missing data and to impute missing data (that is, to replace missing data with a feasible value) where imputation is indicated.

The canonical question in missing data analysis is, what is the cause of missingness? Data can be missing completely at random (MCAR). For example, equipment might fail, causing a loss of heart rate data. A subset of questionnaires might have been copied incorrectly, leaving out a measure. Because the processes that generated the missing data had nothing to do with the nature of the research participants or their data, MCAR data do not risk biasing the results of analysis.

Data can also be missing at random (MAR). For example, older participants might be more likely to drop out of a longitudinal study. In this case, the process that generated the missing data is related to a measured variable in the study. To reduce bias associated with MAR data, data analysis can account for the process by including the measured variable in the model.

Data that are not missing at random (NMAR) are the most problematic and yield biased estimates. With NMAR data, missingness is a function of the very values that are missing (e.g., a person with a history of depression leaving questions about psychiatric history blank).

Many strategies for handling missing data exist, and both instructional articles [1–3] and book-length treatments are available; a good synopsis of books on missing data can be found at https://thestatsgeek.com/stats-books/missing-databooks/. This guideline cannot summarize all the approaches but suggests some reporting guidelines and possible starting points for handling missing data.

Missing Items

It is not unusual for a person to s
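As an illustration of the MCAR, MAR, and NMAR mechanisms described above, the following is a minimal simulation sketch and not part of the guideline itself. Everything in it is an assumption made for illustration: the simulated variables `age` and `y`, the missingness probabilities, and the function name `simulate_missingness` are all hypothetical. The outcome has a true mean of 0, so any estimate far from 0 reflects bias. The "adjusted" estimate regresses the outcome on age among observed cases and averages the predictions over everyone, which is one simple way of including the measured variable in the model.

```python
import numpy as np


def logistic(z):
    """Map any real number to a probability between 0 and 1."""
    return 1.0 / (1.0 + np.exp(-z))


def simulate_missingness(n=20000, seed=0):
    """Estimate the mean of y (true value 0) under three missingness mechanisms.

    All variables and probabilities are hypothetical, chosen only to
    illustrate the definitions of MCAR, MAR, and NMAR.
    """
    rng = np.random.default_rng(seed)
    age = rng.normal(50.0, 10.0, n)                    # always observed
    y = 0.5 * (age - 50.0) + rng.normal(0.0, 5.0, n)   # outcome, true mean 0

    # Probability that y is missing for each participant.
    p_missing = {
        "MCAR": np.full(n, 0.3),              # unrelated to any variable
        "MAR": logistic((age - 50.0) / 5.0),  # depends on observed age
        "NMAR": logistic(y / 3.0),            # depends on y itself
    }

    results = {}
    for name, p in p_missing.items():
        observed = rng.random(n) >= p

        # Complete-case estimate: ignore everyone with missing y.
        complete_case = y[observed].mean()

        # Adjusted estimate: model y from age using observed cases,
        # then average the predictions over the full sample.
        slope, intercept = np.polyfit(age[observed], y[observed], 1)
        adjusted = (intercept + slope * age).mean()

        results[name] = {
            "complete_case": float(complete_case),
            "adjusted": float(adjusted),
        }
    return results


if __name__ == "__main__":
    for name, est in simulate_missingness().items():
        print(f"{name}: complete-case = {est['complete_case']:.2f}, "
              f"adjusted for age = {est['adjusted']:.2f}")
```

In this sketch, the complete-case mean should stay near 0 under MCAR, drift away from 0 under MAR until age is included in the model, and remain biased under NMAR even after adjustment, because the cause of missingness is the unobserved value itself.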
