A differentially private algorithm for range queries on trajectories


Soheila Ghane1 · Lars Kulik1 · Kotagiri Ramamohanarao1

Received: 29 January 2019 / Revised: 28 September 2020 / Accepted: 4 October 2020
© Springer-Verlag London Ltd., part of Springer Nature 2020

Abstract We propose a novel algorithm to ensure ε-differential privacy for answering range queries on trajectory data. To guarantee privacy, differential privacy mechanisms add noise to either the data or the query, thus introducing errors into query answers and potentially decreasing the utility of the information. In contrast to the state of the art, our method achieves significantly lower error, as it is the first data- and query-aware approach for such queries. The key challenge in answering range queries on trajectory data privately is to ensure an accurate count. Simply representing a trajectory as a set instead of a sequence of points will generally lead to highly inaccurate query answers because it ignores the sequential dependency of location points in trajectories, i.e., it violates the consistency of trajectory data. Furthermore, trajectories are generally unevenly distributed across a city, and adding noise uniformly will generally lead to poor utility. To achieve differential privacy, our algorithm adaptively adds noise to the input data according to the given query set. It first privately partitions the data space into uniform regions and computes the traffic density of each region. The regions and their densities, in addition to the given query set, are then used to estimate the distribution of trajectories over the queried space, which ensures high accuracy for the given query set. We show the accuracy and efficiency of our algorithm using extensive empirical evaluations on real and synthetic data sets.

Keywords Spatial histogram · Trajectory · Range query · Differential privacy
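The terms used in the abstract can be made concrete with a small sketch. It is not the algorithm proposed in the paper. It only shows (a) why a range query must count distinct trajectories rather than individual points, and (b) the textbook Laplace mechanism applied to a single such count, which is the uniform-noise baseline that a data- and query-aware method aims to improve on. The data layout (a trajectory as a list of (x, y) points, a query as an axis-aligned rectangle), the point-in-rectangle test used as a simplified notion of "intersecting", and all function names are assumptions made for illustration.

```python
import random

# Assumed representation: a trajectory is an ordered list of (x, y) points,
# and a range query is an axis-aligned rectangle (xmin, ymin, xmax, ymax).

def in_rect(point, rect):
    x, y = point
    xmin, ymin, xmax, ymax = rect
    return xmin <= x <= xmax and ymin <= y <= ymax

def range_count(trajectories, rect):
    """Number of distinct trajectories with at least one sampled point in rect.
    Simplification: segments crossing rect between two samples are ignored."""
    return sum(1 for t in trajectories if any(in_rect(p, rect) for p in t))

def point_count(trajectories, rect):
    """Number of individual points in rect; over-counts a trajectory that
    contributes several points to the same query area."""
    return sum(1 for t in trajectories for p in t if in_rect(p, rect))

def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with rate 1/scale
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def noisy_range_count(trajectories, rect, epsilon, rng):
    """Laplace mechanism for ONE range count. Adding or removing a single
    trajectory changes the count by at most 1, so the sensitivity is 1."""
    sensitivity = 1.0
    true_count = range_count(trajectories, rect)
    return true_count + laplace_noise(sensitivity / epsilon, rng)

# Hypothetical toy data, not taken from the paper.
trajectories = [
    [(0.5, 0.5), (1.5, 1.5), (2.5, 2.5)],
    [(1.2, 1.8), (1.6, 1.1), (3.0, 3.0)],
    [(4.0, 4.0), (5.0, 5.0)],
]
query = (1.0, 1.0, 2.0, 2.0)

print(range_count(trajectories, query))   # distinct trajectories in the area
print(point_count(trajectories, query))   # points in the area (over-counts)
print(noisy_range_count(trajectories, query, epsilon=1.0, rng=random.Random(0)))
```

On the toy data the second trajectory places two points inside the query rectangle, so the point-based count exceeds the trajectory-based count. The noisy answer covers a single query only; answering many queries this way would require splitting the privacy budget across them.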

1 Department of Computing and Information Systems, University of Melbourne, Melbourne, Australia (contact author: Soheila Ghane)

1 Introduction

The popularity of sensor-enabled devices (e.g., wearables and smartphones) has significantly advanced the capability of businesses to collect and analyze people's trajectories. Studying large-scale data sets and analyzing the movement patterns of individuals provide crucial insight for many applications (e.g., traffic management, route planning, urban planning, crime detection). Such applications critically rely on estimating the number of trajectories in an area. For example, in urban planning, computing the number of pedestrians and the flow of their movements provides significant information for the placement of public spaces (e.g., local parks, street spaces, plazas), pedestrian and bicycle paths, and public transport stations. A fundamental query type in studying movement patterns is the range query on trajectories [12,14], which counts the number of distinct trajectories intersecting the query area (2D space). However, computing the cost of this query type on raw trajectories is li