The Vulnerability of Multiplicative Noise Protection to Correlation-Attacks on Continuous Microdata
Yue Ma and Yan-Xia Lin
University of Wollongong, Wollongong, Australia
Rathindra Sarathy
Oklahoma State University, Stillwater, USA

Abstract
When multiplicative noises are used to protect values of a sensitive attribute in microdata, it is frequently assumed that data intruders use the noise-multiplied value to estimate the corresponding unobservable original value of a target record. In this paper, we show that data intruders could easily construct another estimate, instead of using the noise-multiplied value, to attack an original value. The new estimate, namely the “correlation-attack” estimate, is obtained by exploiting the potentially high correlation between the noise-multiplied data and the original data. We provide a detailed comparison between the two estimates (the noise-multiplied value and the correlation-attack estimate) by comparing the mean squared errors of the two underlying estimators, and we propose that data providers should always assess the disclosure risks from both estimators when generating noise-multiplied data. Correspondingly, we propose a disclosure risk measure that data providers could use for noise-generating variable selection during the data masking stage. A simulation study is provided to illustrate how the disclosure risk measure could help with noise-generating variable selection for masking a set of original data.

AMS (2000) subject classification. Primary 62A01; Secondary 62C15.
Keywords and phrases. Data confidentiality, Noise multiplication masking, Continuous microdata, Disclosure risk, Attacking strategy
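The kind of comparison described in the abstract, two estimators of an original value judged by their mean squared errors, can be sketched numerically. The sketch below is only an illustration under stated assumptions: the original values, the Uniform(0.5, 1.5) noise, and the function name `compare_estimators` are all hypothetical, and the alternative estimate used here is a plain linear predictor built from the released data and the (assumed known) noise moments. It is a stand-in, not the paper's correlation-attack estimator, which is not defined in this excerpt.

```python
import numpy as np


def compare_estimators(n=50_000, seed=0):
    """Compare two estimates of X from Y = X * C by empirical MSE.

    All distributions below are hypothetical choices for illustration.
    """
    rng = np.random.default_rng(seed)
    x = rng.normal(100.0, 5.0, size=n)   # hypothetical original values
    c = rng.uniform(0.5, 1.5, size=n)    # hypothetical noise with E[C] = 1
    y = x * c                            # released noise-multiplied values

    # Assumption: the intruder knows the noise distribution, hence Var(C).
    var_c = (1.5 - 0.5) ** 2 / 12.0      # variance of Uniform(0.5, 1.5)

    # Moments of X recovered from Y alone, using independence of X and C:
    #   E[Y] = E[X],  E[Y^2] = E[C^2] E[X^2],  Cov(X, Y) = Var(X).
    mu_x = y.mean()
    ex2 = np.mean(y ** 2) / (1.0 + var_c)
    var_x = max(ex2 - mu_x ** 2, 0.0)    # clip in case of sampling error
    slope = var_x / y.var()

    # Stand-in alternative estimate: linear predictor of X given Y.
    x_hat = mu_x + slope * (y - mu_x)

    mse_y = np.mean((y - x) ** 2)        # using Y itself as the estimate
    mse_lin = np.mean((x_hat - x) ** 2)  # using the linear predictor
    return mse_y, mse_lin


if __name__ == "__main__":
    mse_y, mse_lin = compare_estimators()
    print(f"MSE of noise-multiplied value: {mse_y:.1f}")
    print(f"MSE of linear estimate:        {mse_lin:.1f}")
```

Under these assumed settings, where the original values are tightly clustered relative to the noise, the linear estimate has a much smaller empirical MSE than the noise-multiplied value itself. This illustrates why a data provider may want to assess risk against more than one estimator.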
1 Introduction
Microdata contains details of individuals or businesses across several attributes, such as personal income. The role of a data provider may involve collecting microdata and releasing it to data users for analysis. In carrying out this role, the data provider needs to ensure that data users can obtain information about a population from the released data, while the private information of survey respondents is not revealed to the public. To satisfy both
requirements, the data provider may apply data perturbation techniques to produce a set of perturbed microdata, and release the perturbed microdata to the public. This paper considers the case in which multiplicative noises are used to perturb a set of original data, and the resulting noise-multiplied data is released to the public for analysis (Hwang, 1986; Evans, 1996; Kim and Winkler, 2003; Nayak et al., 2011; Sinha et al., 2011). Using multiplicative noises to perturb sensitive values has been advocated by many researchers because of its appealing features. Multiplicative noises provide uniform protection, in terms of the coefficient of variation of the noises, to all sensitive observations (Nayak et al., 2011). Multiplicative noises are also more suitable for economic modeling of income data in some situations (Kim and Winkler, 2003). The masking mechanism is easy to implement in practice and a balanced utilit