Privacy-Preserving Deep Learning: Revisited and Enhanced



1 National Institute of Information and Communications Technology (NICT), Tokyo, Japan
{phong,aono,wlh,shiho.moriai}@nict.go.jp
2 Kobe University, Kobe, Japan
[email protected]

Abstract. We build a privacy-preserving deep learning system in which many learning participants perform neural-network-based deep learning over a combined dataset of all participants, without actually revealing the participants' local data to a curious server. To that end, we revisit the previous work by Shokri and Shmatikov (ACM CCS 2015) and point out that local data information may actually be leaked to an honest-but-curious server. We then fix that problem by building an enhanced system with the following properties: (1) no information is leaked to the server; and (2) accuracy is kept intact, compared to that of an ordinary deep learning system trained over the same combined dataset. Our system makes use of additively homomorphic encryption, and we show that our usage of encryption adds little overhead to the ordinary deep learning system.

Keywords: Privacy · Deep learning · Neural network · Additively homomorphic encryption

1 Introduction

1.1 Background

In recent years, deep learning (aka deep machine learning) has produced exciting results in both academia and industry, with deep learning systems approaching or even surpassing human-level accuracy. This is thanks to algorithmic breakthroughs and parallel hardware applied to neural networks processing massive amounts of data. Massive collection of data, while vital for deep learning, raises the issue of privacy. Individually, a collected photo can be kept permanently on a company's server, out of the control of the photo's owner. Legally, privacy and confidentiality concerns may prevent hospitals and research centers from sharing their medical datasets, barring them from enjoying the advantage of large-scale deep learning over the joint datasets.

As directly related work, Shokri and Shmatikov [12] presented a system for privacy-preserving deep learning that allows the local datasets of several participants to stay local, while the model of a neural network learned over the joint dataset can still be obtained by the participants. To achieve this, the system in [12] works as follows: each learning participant first computes gradients of a neural network using its local data; then a part (e.g. 1%–10%) of those gradients is sent to a cloud-based parameter server. The server is honest-but-curious: it is assumed to be curious, trying to extract the data of individuals, and yet honest in its operations. To protect privacy, the system of Shokri and Shmatikov admits an accuracy/privacy tradeoff (see Table 1): sharing no local gradients gives perfect privacy but undesirable accuracy; on the other hand, sharing all local gradients violates privacy but yields maximum accuracy.
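To make the mechanism concrete, here is a minimal sketch, in Python with NumPy, of the selective gradient sharing described above. The function names (select_gradients, server_update), the largest-magnitude selection rule, and the plain SGD step on the server are illustrative assumptions for this sketch, not details fixed by [12], which also allows other selection criteria.

```python
import numpy as np

def select_gradients(grad: np.ndarray, fraction: float = 0.1) -> dict:
    """Choose a fraction (e.g. 1%-10%) of gradient coordinates to upload.

    Illustrative rule: pick the largest-magnitude coordinates; returns
    {index: value} to be sent to the parameter server in the clear.
    """
    k = max(1, int(fraction * grad.size))
    flat = grad.ravel()
    top = np.argsort(np.abs(flat))[-k:]   # indices of the k largest |g_i|
    return {int(i): float(flat[i]) for i in top}

def server_update(theta: np.ndarray, shared: dict, lr: float = 0.01) -> None:
    """Honest-but-curious parameter server: applies the shared coordinates
    as a plain SGD step; it sees every uploaded gradient value in the clear."""
    flat = theta.ravel()                  # view, so theta is updated in place
    for i, g in shared.items():
        flat[i] -= lr * g

rng = np.random.default_rng(0)
theta = rng.normal(size=100)              # global model parameters on the server
grad = rng.normal(size=100)               # one participant's local gradient
server_update(theta, select_gradients(grad, fraction=0.1))
```

The paper's critique is visible in this sketch: even the small fraction of coordinates uploaded in the clear can leak information about the local data that produced them.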
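By contrast, the enhanced system described in the abstract uploads gradients under additively homomorphic encryption, so the server can aggregate them without seeing any plaintext. Below is a minimal sketch of that idea using the Paillier cryptosystem via the third-party python-paillier package (pip install phe); the two participants and three-coordinate gradients are toy assumptions, and the paper's actual construction is not claimed to be exactly this.

```python
from phe import paillier   # python-paillier: Paillier is additively homomorphic

# Assumption for this sketch: participants hold the key pair; the server
# receives only ciphertexts and never holds the secret key.
pub, priv = paillier.generate_paillier_keypair(n_length=2048)

# Two participants encrypt their local gradients coordinate-wise.
grad_a = [0.12, -0.05, 0.30]
grad_b = [-0.02, 0.11, 0.07]
enc_a = [pub.encrypt(g) for g in grad_a]
enc_b = [pub.encrypt(g) for g in grad_b]

# The honest-but-curious server adds ciphertexts without decrypting:
# E(x) + E(y) decrypts to x + y for an additively homomorphic scheme.
enc_sum = [ca + cb for ca, cb in zip(enc_a, enc_b)]

# Only the key-holding participants can recover the aggregated gradient.
agg = [priv.decrypt(c) for c in enc_sum]
print(agg)   # approx. [0.10, 0.06, 0.37], up to floating-point encoding error
```

Because the server performs only ciphertext additions, property (1) from the abstract (no information leaked to the server) can hold, while the decrypted sum equals the plaintext sum of the gradients, which is why accuracy is unaffected.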