A new privacy-preserving proximal support vector machine for classification of vertically partitioned data
ORIGINAL ARTICLE
Li Sun • Wei-Song Mu • Biao Qi • Zhi-Jian Zhou
Received: 16 September 2013 / Accepted: 24 February 2014 © Springer-Verlag Berlin Heidelberg 2014
L. Sun • B. Qi • Z.-J. Zhou (corresponding author): College of Science, Applied Mathematics, China Agricultural University, Beijing, China
W.-S. Mu: College of Information and Electrical Engineering, China Agricultural University, Beijing, China
Abstract A new privacy-preserving proximal support vector machine (P3SVM) is formulated for classification of vertically partitioned data. Our classifier is based on the concept of a global random reduced kernel, which is composed of local reduced kernels. Each local reduced kernel is computed using a local reduced matrix with Gaussian perturbation, which is privately generated by only one of the parties and never made public. This formulation leads to an extremely simple and fast privacy-preserving algorithm for generating a linear or nonlinear classifier that merely requires the solution of a single system of linear equations. Comprehensive experiments are conducted on multiple publicly available benchmark datasets to evaluate the performance of the proposed algorithm, and the results indicate that: (a) Our P3SVM achieves better performance than the recently proposed privacy-preserving SVM via random kernels in terms of both classification accuracy and computational time. (b) A significant improvement in accuracy is attained by our P3SVM when compared to classifiers generated using only each party's own data. (c) The generated classifier has accuracy comparable to that of an ordinary PSVM classifier trained on the entire dataset, without releasing any private data.
Keywords Privacy-preserving classification • Global random reduced kernel • Proximal support vector machine • Vertically partitioned data
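To make the abstract's "single system of linear equations" concrete, here is a minimal sketch in Python. It is not the authors' P3SVM. It assumes the standard proximal SVM normal equations, (I/ν + EᵀE) z = Eᵀd with E = [H, −e], and a simplified linear random kernel in which each party shares only the product of its own feature columns with a privately drawn random matrix. The Gaussian perturbation described in the abstract is omitted. All names (`psvm_fit`, `psvm_predict`, `A1`, `B1`, `nu`, and so on) and the synthetic data are hypothetical.

```python
import numpy as np

def psvm_fit(H, d, nu=1.0):
    """Fit a proximal SVM by solving one linear system.

    H  : (m, k) feature or kernel matrix
    d  : (m,) labels in {+1, -1}
    nu : trade-off parameter (assumed name)
    Solves (I/nu + E^T E) z = E^T d with E = [H, -e].
    """
    m = H.shape[0]
    E = np.hstack([H, -np.ones((m, 1))])
    lhs = np.eye(E.shape[1]) / nu + E.T @ E
    z = np.linalg.solve(lhs, E.T @ d)
    return z[:-1], z[-1]  # weights u and offset gamma

def psvm_predict(H, u, gamma):
    """Classify rows of H by the sign of H u - gamma."""
    return np.where(H @ u - gamma >= 0, 1, -1)

# Synthetic two-class data, split column-wise between two parties.
rng = np.random.default_rng(0)
m, k = 200, 10
d = np.where(np.arange(m) % 2 == 0, 1.0, -1.0)
A = rng.normal(size=(m, 6)) + 1.5 * d[:, None]
A1, A2 = A[:, :3], A[:, 3:]          # party 1 and party 2 features

# Each party draws its own private random matrix and shares only A_j B_j^T.
B1 = rng.normal(size=(k, 3))         # known to party 1 only
B2 = rng.normal(size=(k, 3))         # known to party 2 only
K = A1 @ B1.T + A2 @ B2.T            # global kernel = sum of local kernels

u, gamma = psvm_fit(K, d, nu=1.0)
acc = np.mean(psvm_predict(K, u, gamma) == d)
```

For a linear kernel, the sum of the local products equals the kernel of the full data against the concatenated random matrix, so neither party has to reveal its raw columns to form `K`. The nonlinear case and the privacy argument are left to the paper itself.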
1 Introduction
With the advent of cloud computing and big data, concentrating and sharing data has made cooperation between organizations far more convenient, but it has also increased the risk of privacy disclosure. As sensitive information is increasingly disclosed for commercial or legal reasons, privacy preservation has attracted wide attention, and the recent "360 prism door" incident also reflects how imperative it is. Data mining technologies have been viewed as threats to sensitive data containing private information, and these privacy concerns have led to research on privacy-preserving data mining techniques. As early as 1995, privacy-preserving data mining became a special research topic at the first KDD international conference, and in 1999 Rakesh Agrawal gave a keynote speech on it at the KDD conference [1]. Since then, privacy-preserving data mining has been a hot research topic [2–9]. Classification is one of the important data mining tasks and has been