

COMMENTARY

Open Access

Going on up to the SPIRIT in AI: will new reporting guidelines for clinical trials of AI interventions improve their rigour?

Paul Wicks1*, Xiaoxuan Liu2,3,4,5 and Alastair K. Denniston2,3,4,6,7

Keywords: Clinical trial, Machine learning, Artificial intelligence, Reporting guidelines, Checklist

Background

In September 2019, British Prime Minister Boris Johnson posed a dystopian conundrum to the United Nations General Assembly: “AI, what will it mean? Helpful robots washing and caring for an aging population, or pink-eyed terminators sent back from the future to cull the human race?” [1]. Amongst the hyperbole, Johnson posed a question that medicine must address: “Can these algorithms be trusted with our lives and hopes? Should the machines—and only the machines—decide... what surgery or medicines we should receive?... And how do we know that the machines have not been insidiously programmed to fool us or even to cheat us?”

Flattening the hype curve in AI

While it has been recognized that AI may have been “overhyped” [2], today AI algorithms are increasingly involved in drug discovery, symptomatic triage, breast cancer screening, prediction of acute kidney injury, and even mental health support. However, a recent systematic review of over 20,000 medical imaging AI studies found concerning issues of bias, lack of transparency, and inappropriate comparator groups, meaning that fewer than 1% of those studies were of sufficient quality to be considered a trustworthy evaluation of the algorithm [3]. A year after Johnson’s provocation, a global multidisciplinary coalition convened to address these shortcomings and take us towards the “plateau of productivity” [2] of the hype cycle for AI by setting new standards that encourage researchers, journals, and funders to open up the black box and establish public trust.

Over the course of 18 months, the consortium rigorously developed extensions to two of the most trusted minimum reporting guidelines in medicine: Standard Protocol Items: Recommendations for Interventional Trials (SPIRIT) and Consolidated Standards of Reporting Trials (CONSORT). In brief, SPIRIT is the international standard for reporting protocols of randomized clinical trials (i.e. what you intended to do), and CONSORT is the international standard for reporting the delivery and results of those trials (i.e. what you actually did). The new recommendations were developed through a process of systematically gaining consensus from 169 international stakeholders, identifying areas of particular importance for AI interventions that are not covered by the existing guidelines. The SPIRIT-AI and CONSORT-AI checklists contain 15 and 14 new items, respectively, as extensions to the existing SPIRIT 2013 and CONSORT 2010 checklists. The guidelines include requirements for reporting of areas such as the quality and com

* Correspondence: [email protected]. 1 Wicks Digital Health, Lichfield, Staffordshire, UK. Full list of author information is available at the end of the article.