An Anonymization Approach for Dynamic Dataset with Multiple Sensitive Attributes


Abstract Personal health information is now collected in many fields. Sharing this data, for whatever purpose, poses a major privacy risk in health care. A number of anonymization methods have been implemented to protect individuals' privacy, but existing methods support only a single sensitive attribute and low-dimensional data. In our recent work, an anonymization method was proposed to anonymize high-dimensional data with multiple sensitive attributes. Following the principles of k-anonymity and l-diversity, it combines anatomization with an improved slicing strategy. The experimental findings show that it is limited to static data release: in dynamic situations, the technique may produce poor-quality output or high data loss. Hence, the approach proposed here is an anonymization model designed to anonymize a continuously growing dataset while assuring high utility.

Keywords Health care · Privacy preservation · Anonymization · Anatomization · Improved slicing · K-anonymity · L-diversity
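
As an illustrative aside, the two privacy principles named in the abstract can be sketched in Python. This is a minimal sketch and not the chapter's algorithm: it assumes records are plain dictionaries, it uses the simplest "distinct" variant of l-diversity applied to each sensitive attribute separately, and every function name, attribute name, and sample value below is hypothetical.

```python
from collections import defaultdict


def group_by_quasi_identifiers(records, qi_attrs):
    """Group records into equivalence classes sharing the same QI values."""
    groups = defaultdict(list)
    for rec in records:
        key = tuple(rec[attr] for attr in qi_attrs)
        groups[key].append(rec)
    return groups


def is_k_anonymous(records, qi_attrs, k):
    """True if every equivalence class contains at least k records."""
    groups = group_by_quasi_identifiers(records, qi_attrs)
    return all(len(group) >= k for group in groups.values())


def is_l_diverse(records, qi_attrs, sensitive_attrs, l):
    """True if, in every equivalence class, each sensitive attribute
    takes at least l distinct values (distinct l-diversity, an assumption)."""
    groups = group_by_quasi_identifiers(records, qi_attrs)
    return all(
        len({rec[sa] for rec in group}) >= l
        for group in groups.values()
        for sa in sensitive_attrs
    )


# Hypothetical, already-generalized sample data with two sensitive attributes.
records = [
    {"age": "30-39", "zip": "628**", "disease": "flu", "treatment": "rest"},
    {"age": "30-39", "zip": "628**", "disease": "asthma", "treatment": "inhaler"},
    {"age": "40-49", "zip": "627**", "disease": "flu", "treatment": "rest"},
    {"age": "40-49", "zip": "627**", "disease": "diabetes", "treatment": "rest"},
]
qi = ["age", "zip"]
```

With this sample, `is_k_anonymous(records, qi, 2)` holds, while `is_l_diverse(records, qi, ["disease", "treatment"], 2)` fails because one equivalence class has a single treatment value. This shows why a table that looks safe for one sensitive attribute can still leak another.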

V. Shyamala Susan (B)
A.P.C. Mahalaxmi College for Women, Thoothukudi, India
e-mail: [email protected]

© Springer Nature Singapore Pte Ltd. 2021
S. S. Dash et al. (eds.), Intelligent Computing and Applications, Advances in Intelligent Systems and Computing 1172, https://doi.org/10.1007/978-981-15-5566-4_65

1 Introduction Every organization publishes data collected from different users. When the data is published, an individual's personal information may be disclosed. This violates the person's privacy, which needs to be protected from abuse [1]. To maintain the privacy of personal information, anonymization procedures have been introduced. The attributes of a dataset may be classified into different categories: identifiers, quasi-identifiers (QI), sensitive attributes (SA), and non-sensitive attributes [2]. Generalization under k-anonymity [3, 4] and bucketization under l-diversity [5] are the popular privacy preservation approaches. With high-dimensional data, k-anonymity generalization [6, 7] discards an enormous portion of the information. Moreover, these approaches are suitable only for static data release and can handle only a single sensitive attribute. If all sensitive attributes are given a similar amount of privacy, the approaches may not provide predictable outcomes on the accessible information. Real-world data contains Multiple Sensitive Attributes (MSA). To handle that situation, our recent work proposed a model to anonymize MSA. It presents an anonymization method that integrates the benefits of anatomization and enhanced slicing, following the principles of k-anonymity and l-diversity, to handle high-dimensional data with MSA. The anatomization algorithm separates the QI and SA in the dataset and publishes them unmodified in two tables: QIT and ST. The vertical partitioning stage in the enhanced slicing algorithm combines the correlated SA in ST with the correlated QI attributes in QI