Improving Communication Efficiency for Encrypted Distributed Training
Chengdu 611731, China
[email protected]
2 State Key Laboratory of Cryptology, P.O. Box 5159, Beijing 100878, China
Abstract. Secure Multi-Party Computation (SMPC) is usually treated as a special form of encryption. Unlike most encryption methods, which use a private or public key to encrypt data, it splits a value into different shares, and each share works like a private key. Only by obtaining all of these shares can the original data be recovered. In this paper, we utilize SMPC to protect the privacy of gradient updates in distributed learning: each client computes an update and shares it in encrypted form, so that no information about the clients' data can be leaked during the whole computing process. However, encryption brings a sharp increase in communication cost. To improve training efficiency, we apply gradient sparsification, which compresses the gradient by sending only the important entries. To further improve the accuracy and efficiency of the model, we also make some improvements to the original sparsification algorithm. Extensive experiments show that the amount of data to be transferred is reduced while the model still achieves 99.6% accuracy on the MNIST dataset.

Keywords: Distributed training · Secure Multi-Party Computation · Gradient compression
1 Introduction

Distributed training enables larger datasets and more complex models [1] by allowing multiple participants to train a model jointly, each with its own local dataset. On each round, every participant trains a local model and sends its updates to a server, which constructs the global model. However, recent evidence reveals that private information can be leaked through the process of transferring updates. Notable examples include collecting periodic updates and using Data Representatives (DR) and Generative Adversarial Networks (GAN) to recover the original data [1], and inferring unintended features via a malicious participant [2]. When a model is trained across different participants, malicious attackers inevitably exist, so preventive measures must be taken in advance to protect the privacy of each participant. Prior work has protected privacy in machine learning [3–7] using Secure Multi-Party Computation, but we notice that the communication cost can be high, since larger amounts of data need to be transferred after encryption.

© Springer Nature Singapore Pte Ltd. 2020
S. Yu et al. (Eds.): SPDE 2020, CCIS 1268, pp. 577–590, 2020. https://doi.org/10.1007/978-981-15-9129-7_40
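As a concrete illustration of the share-splitting idea described above, the following is a minimal sketch of additive secret sharing over a finite field. The modulus P and all function names are our own illustrative choices, not taken from this paper's construction:

```python
import random

P = 2**61 - 1  # a large prime modulus; the choice of field is an assumption

def share(value, n_parties):
    """Split an integer into n additive shares that sum to it modulo P."""
    shares = [random.randrange(P) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % P)
    return shares

def reconstruct(shares):
    """Recover the original value; any proper subset of shares reveals nothing."""
    return sum(shares) % P

shares = share(42, 3)
assert reconstruct(shares) == 42
```

Note that each share in isolation is uniformly random, which is why a share "works like a private key": all shares are required to recover the value.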
M. Zhang et al.
For simplicity, we consider synchronized algorithms for encrypted distributed training, where a typical round consists of the following steps:

1. A subset of the participants is selected, and each participant downloads the global model from the server.
2. Each participant locally computes the gradient updates based on its dataset.
3. The encrypted gradient updates are computed and sent from the participants to the server.
4. The server aggregates the encrypted updates and applies them to the global model using stochastic gradient descent.
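The round above can be sketched end to end. The snippet below combines top-k gradient sparsification (step 2) with additive secret sharing of fixed-point-encoded updates and their aggregation (steps 3 and 4); the modulus, the scaling factor, and all function names are illustrative assumptions rather than the paper's actual protocol:

```python
import numpy as np

P = 2**61 - 1   # field modulus (assumption; the paper does not fix one here)
SCALE = 10**6   # fixed-point scale for encoding float gradients (assumption)

def top_k(grad, k):
    """Gradient sparsification: keep only the k largest-magnitude entries."""
    idx = np.argsort(np.abs(grad))[-k:]
    sparse = np.zeros_like(grad)
    sparse[idx] = grad[idx]
    return sparse

def encode(grad):
    """Map float gradients to field elements via fixed-point encoding."""
    return np.round(grad * SCALE).astype(np.int64) % P

def decode(x):
    """Inverse of encode; residues above P//2 represent negative values."""
    signed = np.where(x > P // 2, x - P, x)
    return signed.astype(np.float64) / SCALE

def share(x, n):
    """Split an encoded update into n additive shares (step 3)."""
    shares = [np.random.randint(0, P, size=x.shape, dtype=np.int64)
              for _ in range(n - 1)]
    shares.append((x - sum(shares)) % P)
    return shares

# Two participants, each holding a toy local gradient (step 2).
g1 = top_k(np.array([0.5, -0.01, 0.02, -0.9]), k=2)
g2 = top_k(np.array([0.1, 0.03, -0.7, 0.4]), k=2)

# Each participant secret-shares its sparsified, encoded update.
shares1, shares2 = share(encode(g1), 3), share(encode(g2), 3)

# Step 4: shares are added component-wise; summing the aggregated
# shares reconstructs only the sum of the updates, never an individual one.
agg = [(s1 + s2) % P for s1, s2 in zip(shares1, shares2)]
total = decode(sum(agg) % P)
assert np.allclose(total, g1 + g2)
```

Because aggregation happens on shares, the server learns only the combined update, while sparsification keeps the number of transferred values small; this is the communication-versus-privacy trade-off the paper targets.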