Differential privacy distributed learning under chaotic quantum particle swarm optimization


Yun Xie1,2 · Peng Li1,3 · Jindan Zhang4 · Marek R. Ogiela5

Received: 15 September 2020 / Accepted: 12 October 2020

© Springer-Verlag GmbH Austria, part of Springer Nature 2020

Abstract Differential privacy is a common framework for building machine learning with privacy guarantees. Extensive research has focused on differentially private stochastic gradient descent (SGD-DP) and its variants in distributed machine learning, with the aim of improving training efficiency while protecting privacy. However, SGD-DP relies on the premise of convex optimization. In large-scale distributed machine learning, the objective function is more likely to be non-convex, which not only makes the gradient difficult to compute but also makes training prone to falling into local optima, so a truly global optimum is hard to reach. To address this issue, we propose a novel differential privacy optimization algorithm based on quantum particle swarm theory that is suitable for both convex and non-convex optimization. We further apply adaptive contraction–expansion and chaotic search to overcome premature convergence, and provide a theoretical analysis of convergence and privacy protection. We also verify through experiments that the practical performance of the algorithm is consistent with the theoretical analysis.

Keywords Distributed machine learning · Differential privacy · Chaotic search · Quantum particle swarm optimization

Mathematics Subject Classification 68T20

Peng Li [email protected]

1 College of Computer, Nanjing University of Posts and Telecommunications, Nanjing 210003, China

2 Zijin College, Nanjing University of Science and Technology, Nanjing 210023, China

3 Jiangsu High Technology Research Key Laboratory for Wireless Sensor Networks, Nanjing 210003, China

4 Xianyang Vocational Technical College, Xianyang 712000, China

5 AGH University of Science and Technology, 30 Mickiewicza Ave, 30-059 Kraków, Poland


1 Introduction

Machine learning has been widely used in various fields and has achieved remarkable results. Big data underpins the accuracy of trained models, but it also places higher demands on the computing power and storage capacity of local working nodes, especially for machine learning with complex models. At present, it is difficult to complete machine learning tasks by relying solely on local resources, whether in terms of training speed or training accuracy. To overcome this bottleneck, parallel and distributed machine learning technologies have attracted growing attention from scholars in recent years [1–3]. In data parallelism, a large data set is divided and distributed to multiple nodes, each of which trains locally [4,5]. Distributed learning systems have demonstrated better scalability in the face of problems su