Understanding the temporal evolution of COVID-19 research through machine learning and natural language processing
Ashkan Ebadi1,4 · Pengcheng Xi2 · Stéphane Tremblay2 · Bruce Spencer3,5 · Raman Pall2 · Alexander Wong6,7
Received: 12 July 2020
© Crown 2020
Abstract
The outbreak of the novel coronavirus disease 2019 (COVID-19), caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), has been continuously affecting human lives and communities around the world in many ways, from cities under lockdown to new social experiences. Although in most cases COVID-19 results in mild illness, it has drawn global attention due to the extremely contagious nature of SARS-CoV-2. Governments and healthcare professionals, along with people and society as a whole, have taken measures to break the chain of transmission and flatten the epidemic curve. In this study, we used multiple data sources, i.e., PubMed and ArXiv, and built several machine learning models to characterize the landscape of current COVID-19 research by identifying the latent topics and analyzing the temporal evolution of the extracted research themes, publication similarity, and sentiments within the time frame of January–May 2020. Our findings confirm that the types of research available in PubMed and ArXiv differ significantly, with the former exhibiting greater diversity in terms of COVID-19 related issues and the latter focusing more on intelligent systems/tools to predict/diagnose COVID-19. The special attention of the research community to high-risk groups and people with complications was also confirmed.
Keywords
COVID-19 research landscape · Topics evolution · Machine learning · Structural topic modeling · Text mining
* Ashkan Ebadi, ashkan.ebadi@nrc-cnrc.gc.ca
1 National Research Council Canada, Montréal, QC H3T 1J4, Canada
2 National Research Council Canada, Ottawa, ON K1K 2E1, Canada
3 National Research Council Canada, Fredericton, NB E3B 9W4, Canada
4 Concordia Institute for Information Systems Engineering, Concordia University, Montréal, QC H3G 2W1, Canada
5 Faculty of Computer Science, University of New Brunswick, Fredericton, NB E3B 5A3, Canada
6 Department of Systems Design Engineering, University of Waterloo, Waterloo, ON N2L 3G1, Canada
7 Waterloo Artificial Intelligence Institute, Waterloo, ON N2L 3G1, Canada
Scientometrics
Introduction
The ongoing pandemic of the coronavirus disease 2019 (COVID-19), caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), has been affecting human lives and communities around the world, causing global social and economic disruption (International Monetary Fund 2020). The first case of COVID-19 can be traced back to Wuhan (China) in December 2019 (Hui et al. 2020). The World Health Organization (WHO) declared the outbreak in January 2020 and characterized it as a pandemic in March 2020 (World Health Organization 2020). As of June 2020, more than 6.5 million COVID-19 cases have been reported worldwide resulting in more than 500,00