COVID-19 open source data sets: a comprehensive survey


Junaid Shuja1,3,4 · Eisa Alanazi2,4 · Waleed Alasmary3,4 · Abdulaziz Alashaikh5

© Springer Science+Business Media, LLC, part of Springer Nature 2020

Abstract In December 2019, a novel virus named COVID-19 emerged in the city of Wuhan, China. In early 2020, the COVID-19 virus spread to all continents of the world except Antarctica, causing widespread infections and deaths owing to its contagious characteristics and the absence of a medically proven treatment. The COVID-19 pandemic has been termed the most consequential global crisis since the World Wars. The first line of defense against the spread of COVID-19 is non-pharmaceutical measures such as social distancing and personal hygiene. The pandemic, which affects billions of lives economically and socially, has motivated the scientific community to develop solutions based on computer-aided digital technologies for the diagnosis, prevention, and estimation of COVID-19. Some of these efforts focus on statistical and Artificial Intelligence-based analysis of the available data concerning COVID-19. All of these scientific efforts require that the data used for the analysis be open source, to promote the extension, validation, and collaboration of the work in the fight against the global pandemic. Our survey is motivated by the open source efforts, which can be mainly categorized as (a) COVID-19 diagnosis from CT scans, X-ray images, and cough sounds, (b) COVID-19 case reporting, transmission estimation, and prognosis from epidemiological, demographic, and mobility data, (c) COVID-19 emotional and sentiment analysis from social media, and (d) knowledge-based discovery and semantic analysis from the collection of scholarly articles covering COVID-19. We survey and compare research works in these directions that are accompanied by open source data and code. Future research directions for data-driven COVID-19 research are also discussed. We hope that the article will give the scientific community a starting point for open source, extensible, and transparent research in the collective fight against the COVID-19 pandemic.
Keywords COVID-19 · Coronavirus · Pandemic · Machine learning · Artificial intelligence · Open source · Data sets

This article belongs to the Topical Collection: Artificial Intelligence Applications for COVID-19, Detection, Control, Prediction, and Diagnosis

Junaid Shuja
[email protected]

1 Department of Computer Science, COMSATS University Islamabad, Abbottabad Campus, Islamabad, Pakistan

2 Department of Computer Science, Umm Al-Qura University, Makkah, Saudi Arabia

3 Department of Computer Engineering, Umm Al-Qura University, Makkah, Saudi Arabia

4 Center of Innovation and Development in Artificial Intelligence, Umm Al-Qura University, Makkah, Saudi Arabia

5 Computer Engineering and Networks Department, University of Jeddah, Jeddah, Saudi Arabia

1 Introduction

The COVID-19 virus has been declared a pandemic by the World Health Organization (WHO) with more than t