Semantic and cognitive tools to aid statistical science: replace confidence and significance by compatibility and surprise



(2020) 20:244

TECHNICAL ADVANCE

Open Access

Semantic and cognitive tools to aid statistical science: replace confidence and significance by compatibility and surprise

Zad Rafi1* and Sander Greenland2

Abstract

Background: Researchers often misinterpret and misrepresent statistical outputs. This abuse has led to a large literature on modification or replacement of testing thresholds and P-values with confidence intervals, Bayes factors, and other devices. Because the core problems appear cognitive rather than statistical, we review some simple methods to aid researchers in interpreting statistical outputs. These methods emphasize logical and information concepts over probability, and thus may be more robust to common misinterpretations than are traditional descriptions.

Methods: We use the Shannon transform of the P-value p, also known as the binary surprisal or S-value s = −log₂(p), to provide a measure of the information supplied by the testing procedure, and to help calibrate intuitions against simple physical experiments like coin tossing. We also use tables or graphs of test statistics for alternative hypotheses, and interval estimates for different percentile levels, to thwart fallacies arising from arbitrary dichotomies. Finally, we reinterpret P-values and interval estimates in unconditional terms, which describe compatibility of data with the entire set of analysis assumptions. We illustrate these methods with a reanalysis of data from an existing record-based cohort study.

Conclusions: In line with other recent recommendations, we advise that teaching materials and research reports discuss P-values as measures of compatibility rather than significance, compute P-values for alternative hypotheses whenever they are computed for null hypotheses, and interpret interval estimates as showing values of high compatibility with data, rather than regions of confidence. Our recommendations emphasize cognitive devices for displaying the compatibility of the observed data with various hypotheses of interest, rather than focusing on single hypothesis tests or interval estimates. We believe these simple reforms are well worth the minor effort they require.
Keywords: Confidence intervals, Cognitive science, Bias, Data interpretation, Evidence, Hypothesis tests, Information, P-values, Statistical significance, Models, statistical
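
The S-value defined in the Methods paragraph, s = −log2(p), can be illustrated with a short sketch. This is not the authors' code (their scripts are in R, at the OSF link given in this article); it is a minimal Python illustration, and the function name `s_value` and the example P-values are assumptions chosen here for demonstration. The coin-tossing reading is that s bits of information against the test hypothesis is about as surprising as seeing all heads in s tosses of a coin assumed to be fair.

```python
import math


def s_value(p: float) -> float:
    """Binary surprisal (S-value) in bits: s = -log2(p).

    `s_value` is a hypothetical helper name used for this sketch only.
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("a P-value must lie in the interval (0, 1]")
    return -math.log2(p)


if __name__ == "__main__":
    # Illustrative P-values only; they are not taken from the article's data.
    for p in (0.5, 0.25, 0.05, 0.005):
        s = s_value(p)
        # Round to the nearest whole toss for the coin-tossing comparison.
        tosses = round(s)
        print(f"p = {p:<6} s = {s:.2f} bits "
              f"(roughly {tosses} heads in a row from a fair coin)")
```

Under this transform, the conventional threshold p = 0.05 corresponds to only about 4.3 bits, a little more surprising than four heads in a row, which is the kind of calibration against physical experiments that the abstract describes.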

* Correspondence: [email protected]
Available Code: The R scripts to reproduce the graphs in the text can be obtained at https://osf.io/6w8g9/
1 Department of Population Health, NYU Langone Medical Center, 227 East 30th Street, New York, NY 10016, USA
Full list of author information is available at the end of the article

© The Author(s). 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were m