Using machine learning of clinical data to diagnose COVID-19: a systematic review and meta-analysis


(2020) 20:247

RESEARCH ARTICLE

Open Access

Wei Tse Li1,2†, Jiayan Ma1,2†, Neil Shende1,2†, Grant Castaneda1,2, Jaideep Chakladar1,2, Joseph C. Tsai1,2, Lauren Apostol1,2, Christine O. Honda1,2, Jingyue Xu1,2, Lindsay M. Wong1,2, Tianyi Zhang1,2, Abby Lee1,2, Aditi Gnanasekar1,2, Thomas K. Honda1,2, Selena Z. Kuo3, Michael Andrew Yu4, Eric Y. Chang5,6, Mahadevan “Raj” Rajasekaran7,8 and Weg M. Ongkeko1,2*

Abstract

Background: The recent Coronavirus Disease 2019 (COVID-19) pandemic has placed severe stress on healthcare systems worldwide, which is amplified by the critical shortage of COVID-19 tests.

Methods: In this study, we propose to generate a more accurate diagnosis model of COVID-19 based on patient symptoms and routine test results by applying machine learning to reanalyze COVID-19 data from 151 published studies. We aim to investigate correlations between clinical variables, cluster COVID-19 patients into subtypes, and generate a computational classification model for discriminating between COVID-19 patients and influenza patients based on clinical variables alone.

Results: We discovered several novel associations between clinical variables, including correlations between being male and having higher levels of serum lymphocytes and neutrophils. We found that COVID-19 patients could be clustered into subtypes based on serum levels of immune cells, gender, and reported symptoms. Finally, we trained an XGBoost model to achieve a sensitivity of 92.5% and a specificity of 97.9% in discriminating COVID-19 patients from influenza patients.

Conclusions: We demonstrated that computational methods trained on large clinical datasets could yield ever more accurate COVID-19 diagnostic models to mitigate the impact of the lack of testing. We also presented previously unknown COVID-19 clinical variable correlations and clinical subgroups.

Keywords: COVID-19, Machine learning, Diagnostic model
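The sensitivity and specificity quoted in the Results can be read as simple ratios over a confusion matrix. The sketch below is only an illustration of those two definitions, not the authors' pipeline: the function name `sensitivity_specificity`, the label encoding (1 = COVID-19, 0 = influenza), and the toy labels are all assumptions made for this example and do not come from the study's data.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels.

    Assumed encoding (hypothetical, for illustration only):
    1 = COVID-19 (positive class), 0 = influenza (negative class).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # COVID-19 called COVID-19
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # COVID-19 called influenza
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # influenza called influenza
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # influenza called COVID-19

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")

    sensitivity = tp / (tp + fn)  # share of true COVID-19 cases detected
    specificity = tn / (tn + fp)  # share of true influenza cases correctly ruled out
    return sensitivity, specificity


if __name__ == "__main__":
    # Toy labels, invented for this example; not data from the paper.
    y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
    y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
    sens, spec = sensitivity_specificity(y_true, y_pred)
    print(f"sensitivity={sens:.3f}, specificity={spec:.3f}")
```

Under this reading, a sensitivity of 92.5% would mean that 92.5% of true COVID-19 cases in the evaluation set were classified as COVID-19, and a specificity of 97.9% would mean that 97.9% of influenza cases were classified as influenza.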

* Correspondence: [email protected]
† Wei Tse Li, Jiayan Ma and Neil Shende contributed equally to this work.
1 Department of Surgery, Division of Otolaryngology-Head and Neck Surgery, UC San Diego School of Medicine, San Diego, CA 92093, USA
2 Research Service, VA San Diego Healthcare System, San Diego, CA 92161, USA
Full list of author information is available at the end of the article

© The Author(s). 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is