An Effective and Efficient Heuristic Privacy Preservation Algorithm for Decremental Anonymization Datasets




Abstract. (α, k)-Anonymity is a well-known anonymization model that extends k-Anonymity. It was proposed to address privacy violations in published datasets caused by identity linkage attacks, attribute linkage attacks, and probability inference linkage attacks. Unfortunately, (α, k)-Anonymity is generally sufficient only for datasets that are published once. If a published dataset is dynamic, e.g., a decremental dataset whose data is repeatedly changed by deletions and which is published multiple times, then the private data of the users collected in it can be violated by an attacker who compares the published releases. To remove this vulnerability of (α, k)-Anonymity, a decremental privacy preservation algorithm has already been proposed. Although that algorithm can address privacy violations in published decremental datasets, it has a serious weakness that must be improved: transforming the data to satisfy the specified privacy preservation constraints is highly complex. For this reason, this work proposes a heuristic privacy preservation algorithm for publishing decremental datasets that is based on clustering techniques. Aside from preserving privacy, the proposed algorithm also maintains data utility and execution time as much as possible. Furthermore, experimental results indicate that the proposed algorithm is highly effective and efficient.

Keywords: Heuristic algorithm · Clustering algorithm · Decremental datasets · Privacy preservation · Anonymity model
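For orientation, the (α, k)-Anonymity condition named in the abstract can be illustrated with a minimal sketch. It assumes the usual definition of the model: every group of records sharing the same quasi-identifier values contains at least k records, and no sensitive value has a relative frequency above α within a group. The function name, the record layout, and the toy values are hypothetical and are not taken from this paper; the sketch only checks the condition and is not the proposed algorithm.

```python
from collections import Counter, defaultdict


def satisfies_alpha_k(records, qi_keys, sensitive_key, alpha, k):
    """Check the assumed (alpha, k)-Anonymity condition on a list of dict records."""
    # Group the sensitive values by their quasi-identifier combination.
    groups = defaultdict(list)
    for rec in records:
        groups[tuple(rec[q] for q in qi_keys)].append(rec[sensitive_key])

    for values in groups.values():
        # k condition: each group must contain at least k records.
        if len(values) < k:
            return False
        # alpha condition: the most frequent sensitive value must not
        # exceed a relative frequency of alpha inside the group.
        top_count = Counter(values).most_common(1)[0][1]
        if top_count / len(values) > alpha:
            return False
    return True


# Hypothetical toy release with generalized quasi-identifiers.
release = [
    {"age": "30-39", "zip": "902**", "disease": "flu"},
    {"age": "30-39", "zip": "902**", "disease": "cancer"},
    {"age": "40-49", "zip": "941**", "disease": "flu"},
    {"age": "40-49", "zip": "941**", "disease": "asthma"},
]
ok_before = satisfies_alpha_k(release, ["age", "zip"], "disease", alpha=0.5, k=2)

# Deleting one record shrinks its group below k, so the same
# generalization no longer satisfies the condition.
after_deletion = release[:1] + release[2:]
ok_after = satisfies_alpha_k(after_deletion, ["age", "zip"], "disease", alpha=0.5, k=2)
```

In this toy case the first release passes the check and the release after one deletion does not, which hints at why a dataset that shrinks between publications may need to be re-anonymized before it is published again.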

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 J. I.-Z. Chen et al. (Eds.): ICIPCN 2020, AISC 1200, pp. 244–257, 2021. https://doi.org/10.1007/978-3-030-51859-2_22

1 Introduction

In recent decades, several well-known privacy preservation models have been proposed, such as k-Anonymity [1, 2], l-Diversity [3], t-Closeness [4], (α, k)-Anonymity [5], and (k, e)-Anonymous [10, 11]. Unfortunately, these models are generally designed to address privacy violations in datasets that are published once. They can therefore be insufficient for datasets whose data is repeatedly changed by deletions and which are published multiple times. A real-life example of a published dataset that changes through deletions is a hospital in California that publishes all discharged patient data to researchers every six months [6]. To address privacy violation issues, the authors of [7] suggest that all explicit identifier values collected in a dataset be removed before the dataset is published. Moreover, the unique quasi-identifier values are generali