Federated Learning for Healthcare Informatics


Jie Xu¹ · Benjamin S. Glicksberg² · Chang Su¹ · Peter Walker³ · Jiang Bian⁴ · Fei Wang¹

Received: 19 August 2020 / Revised: 21 October 2020 / Accepted: 30 October 2020
© Springer Nature Switzerland AG 2020

Abstract

With the rapid development of computer software and hardware technologies, healthcare data are becoming increasingly available from clinical institutions, patients, insurance companies, and pharmaceutical industries, among others. This access provides an unprecedented opportunity for data science technologies to derive data-driven insights and improve the quality of care delivery. Healthcare data, however, are usually fragmented and private, making it difficult to generate robust results across populations. For example, different hospitals own the electronic health records (EHR) of different patient populations, and these records are difficult to share across hospitals because of their sensitive nature. This creates a major barrier to developing effective, generalizable analytical approaches, which need diverse "big data." Federated learning, a mechanism for training a shared global model with a central server while keeping all sensitive data in the local institutions where they belong, holds great promise for connecting fragmented healthcare data sources while preserving privacy. The goal of this survey is to review federated learning technologies, particularly within the biomedical space. In particular, we summarize the general solutions to the statistical challenges, system challenges, and privacy issues in federated learning, and point out the implications and potential in healthcare.

Keywords Federated learning · Healthcare · Privacy

Fei Wang

[email protected]

¹ Department of Population Health Sciences, Weill Cornell Medicine, New York, NY, USA
² Institute for Digital Health, Icahn School of Medicine at Mount Sinai, New York, NY, USA
³ U.S. Department of Defense Joint Artificial Intelligence Center, Washington, D.C., USA
⁴ Department of Health Outcomes and Biomedical Informatics, College of Medicine, University of Florida, Gainesville, FL, USA

Journal of Healthcare Informatics Research

1 Introduction

Recent years have witnessed a surge of interest in healthcare data analytics, because more and more such data are becoming readily available from various sources, including clinical institutions, individual patients, insurance companies, and pharmaceutical industries, among others. This provides an unprecedented opportunity to develop computational techniques that derive data-driven insights for improving the quality of care delivery [72, 105]. Healthcare data are typically fragmented because of the complicated nature of the healthcare system and its processes. For example, different hospitals may be able to access the clinical records of only their own patient populations. These records are highly sensitive because they contain the protected health information (PHI) of individuals. Rigorous r