Efficient distributed privacy-preserving collaborative outlier detection


Zhaohui Wei¹ · Qingqi Pei¹ · Xuefeng Liu¹ · Lichuan Ma¹

Received: 8 September 2019 / Accepted: 11 March 2020
© Springer Science+Business Media, LLC, part of Springer Nature 2020

Abstract

As a common way to identify anomalous data, outlier detection is widely applicable to intrusion detection, adverse reaction analysis, financial fraud prevention, etc. The accuracy of outlier detection depends crucially on the amount of data involved: the more data participates in detection, the higher the accuracy. For this reason, cross-dataset collaborative outlier detection has been introduced to overcome the lack of data in a single-dataset setting. However, privacy concerns seriously hinder the application of collaborative outlier detection, since in practice most organizations are unwilling to share their data with others directly. In this paper, we present efficient protocols for privacy-preserving collaborative outlier detection on arbitrarily partitioned data using the Local Distance-based Outlier Factor (LDOF). Our protocols fall in the two-server model, where data owners distribute their private data among two non-colluding servers, which detect outliers on the joint data using secure two-party computation. In particular, we perform the arithmetic operations that take place inside LDOF on arithmetic circuits instead of boolean circuits, and perform the sorting operations on boolean circuits. This design lets each operation run on the circuit type best suited to it, which makes our scheme more efficient. In addition, to further improve protocol efficiency, locality-sensitive hashing (LSH) is used to filter out data that do not need secure computation, reducing the amount of shared data. We implement our system in C++ and evaluate it on real data. The security analysis and experiments demonstrate the security and efficiency of the proposed scheme. Our protocols are faster than previous methods.

Keywords Privacy-preserving · Outlier detection · Distributed data
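To make the split between sorting and arithmetic in the abstract concrete, the following is a minimal plaintext sketch of an LDOF score, assuming the standard definition: the mean distance from a point to its k nearest neighbours, divided by the mean pairwise distance among those neighbours. It runs in the clear on unshared data and is not the secure two-party protocol of this paper; the names `Point`, `euclidean` and `ldof`, and the choice of Euclidean distance, are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical point type for this sketch: one coordinate per attribute.
using Point = std::vector<double>;

// Euclidean distance between two points of equal dimension.
double euclidean(const Point& a, const Point& b) {
    assert(a.size() == b.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const double diff = a[i] - b[i];
        sum += diff * diff;
    }
    return std::sqrt(sum);
}

// Plaintext LDOF of data[idx] using its k nearest neighbours.
// Requires 2 <= k < data.size(). If all k neighbours coincide, the
// denominator is zero and the result is not finite.
double ldof(const std::vector<Point>& data, std::size_t idx, std::size_t k) {
    assert(idx < data.size() && k >= 2 && k < data.size());

    // Distances from the query point to every other point.
    std::vector<std::pair<double, std::size_t>> dist;
    for (std::size_t j = 0; j < data.size(); ++j) {
        if (j != idx) dist.emplace_back(euclidean(data[idx], data[j]), j);
    }

    // Sorting step: move the k nearest neighbours to the front.
    std::partial_sort(dist.begin(),
                      dist.begin() + static_cast<std::ptrdiff_t>(k),
                      dist.end());

    // Arithmetic step 1: mean distance from the point to its k neighbours.
    double knn_mean = 0.0;
    for (std::size_t i = 0; i < k; ++i) knn_mean += dist[i].first;
    knn_mean /= static_cast<double>(k);

    // Arithmetic step 2: mean pairwise distance among the k neighbours.
    double inner_sum = 0.0;
    for (std::size_t i = 0; i < k; ++i) {
        for (std::size_t j = i + 1; j < k; ++j) {
            inner_sum += euclidean(data[dist[i].second], data[dist[j].second]);
        }
    }
    const double pairs = static_cast<double>(k) * static_cast<double>(k - 1) / 2.0;
    const double inner_mean = inner_sum / pairs;

    return knn_mean / inner_mean;
}
```

In this sketch, a point far from a tight cluster gets a score well above 1, while a point inside the cluster scores below 1. The neighbour selection is the comparison-heavy part and the two averages are the arithmetic-heavy part, which mirrors the division of work between boolean and arithmetic circuits described in the abstract.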

This article belongs to the Topical Collection: Special Issue on Security and Privacy in Machine Learning Assisted P2P Networks. Guest Editors: Hongwei Li, Rongxing Lu and Mohamed Mahmoud

¹ School of Telecommunications Engineering, Xidian University, Xian, Shaanxi, 710071, China

1 Introduction

Outlier detection has numerous useful applications in many areas such as telecom and credit card fraud, electronic commerce, loan approval, pharmaceutical research, weather prediction, financial applications and so on. For instance, financial companies and payment networks can combine transaction history, merchant data, and account holder information to prevent financial fraud effectively, while health data from different hospitals can be used for adverse reaction analysis. While this data can be locally analyzed, the results of local analysis may not provide co