Local Differential Privacy for Data Streams


Abstract. The dynamic nature, large volume, and complex structure of data streams make them difficult to analyze and protect in real time. Traditional privacy protection models such as differential privacy rely on trusted servers or companies, which increases the uncertainty of protecting streaming privacy. In this paper, we propose a new privacy protection protocol for data streams under local differential privacy and w-event privacy, which makes it possible to keep up-to-date statistics over time and remains usable when third parties are untrusted. We use a sliding window to collect the data streams in real time, detect significant moves, capture the latest trend in the data distribution, and release perturbed data stream reports in a timely manner. This protocol provides a provable privacy guarantee, reduces computation and storage costs, and provides valuable statistical information. Experimental results on real datasets show that the proposed method can protect the privacy of the data streams while still providing usable statistics.

Keywords: Data streams privacy · Sliding window · Local differential privacy · w-event

1 Introduction

With the development of 5G technology, intelligent devices and sensors produce ever more dynamic data, which we call data streams. Real-time analysis of stream data can yield valuable information for understanding important phenomena [13], so it is widely used in application fields such as mobile crowd sensing [28], traffic service stream monitoring [19], and social network hotspot tracking [26]. Data service providers collect real-time data streams, publish real-time statistics, and share and analyze [29] them with interested third parties to improve service quality.

This work was supported by National Natural Science Foundation of China (61572034), Major Science and Technology Projects in Anhui Province (18030901025), Anhui Province University Natural Science Fund (KJ2019A0109).
© Springer Nature Singapore Pte Ltd. 2020. S. Yu et al. (Eds.): SPDE 2020, CCIS 1268, pp. 143–160, 2020. https://doi.org/10.1007/978-981-15-9129-7_11

X. Fang et al.

However, this process carries potential privacy risks. Because untrusted third parties are involved, an attacker may use a differential attack to query a single user's original data at multiple timestamps, reconstruct the user's data trajectory, and disclose the user's private information [20]. Recent research [4] has found that users' movement trajectories, derived from mobility data held by mobile phone operators, are highly unique. Even if a desensitized dataset provides only a small amount of anonymous information, it can still be linked to a specific user with relevant background knowledge. A series of similar findings reveals that the privacy of personal data streams faces a huge risk, so research on and development of privacy-preserving collection and release mechanisms for data streams is of great significance, but