
(2020) 15:226

RESEARCH

Open Access

Participation in patient support forums may put rare disease patient data at risk of re-identification

James Gow1,3*, Colin Moffatt2,3 and Jamie Blackport3

Abstract

Background: Rare disease patients often struggle to find both medical advice and emotional support for their diagnosis. Consequently, many rare disease patient support forums have appeared on hospital webpages, social media sites, and rare disease foundation sites. However, we argue that engagement in these groups may pose a healthcare data privacy threat to many participants, since it makes a series of patient indirect identifiers ‘readily available’ in combination with rare disease conditions. This information creates a risk of re-identification because a motivated attacker may use the unique combination of a patient’s identifiers and disease condition to re-identify them in anonymized data.

Results: To assess this risk of re-identification, patients’ direct and indirect identifiers were mined from patient support forums for 80 patients across eight rare diseases. This data mining consisted of scanning patient testimonials, social media sites, and public records to collect identifiers linked to a rare disease patient. We then estimated the number of people in the United States who may share each patient’s combination of marital status, 3-digit ZIP code, age, and sex, as well as their rare disease condition, since such information is commonly found in health records that have been de-identified under HIPAA’s ‘Safe Harbor’ method. By these estimates, nearly 75% of patients could be at high risk of re-identification in healthcare datasets in which they appear, due to their unique combination of identifiers.

Conclusions: The results of this study show that these rare disease patients, by choosing to provide support for their community, are putting all their healthcare data at risk of re-identification. This paper demonstrates how simple adjustments to participation guidelines in such support forums, in combination with improved privacy measures at the organizational level, could mitigate this risk. Additionally, this paper suggests future investigation into treating certain ‘risky’ International Classification of Diseases (ICD) codes as quasi-identifiers in de-identified datasets, to further protect patients’ privacy while maintaining the utility of such rare disease support groups.

Keywords: HIPAA, Healthcare, Privacy, Rare disease, Support forums
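The abstract does not state the formula used to estimate how many people share a patient's combination of identifiers. As an illustration only, the following Python sketch shows one simple way such an estimate could be formed: multiply a 3-digit ZIP area's population by the fraction of residents matching each indirect identifier and by the disease prevalence. The independence assumption, the names `QuasiIdentifierShares`, `expected_matches`, and `is_high_risk`, the threshold, and every number in the usage lines are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QuasiIdentifierShares:
    """Fraction of a ZIP3 population matching each indirect identifier.

    All fields are hypothetical inputs, each expected to lie in [0, 1].
    """
    age: float
    sex: float
    marital_status: float


def expected_matches(zip3_population: int,
                     shares: QuasiIdentifierShares,
                     disease_prevalence: float) -> float:
    """Estimate how many people share one combination of identifiers.

    Assumes (for illustration) that age, sex, marital status, and the
    disease are statistically independent within the ZIP3 area.
    """
    if zip3_population < 0:
        raise ValueError("zip3_population must be non-negative")
    fractions = (shares.age, shares.sex, shares.marital_status,
                 disease_prevalence)
    estimate = float(zip3_population)
    for fraction in fractions:
        if not 0.0 <= fraction <= 1.0:
            raise ValueError("each fraction must lie in [0, 1]")
        estimate *= fraction
    return estimate


def is_high_risk(expected: float, threshold: float) -> bool:
    """Flag a combination as high risk when fewer than `threshold`
    people are expected to share it (the threshold is an assumption)."""
    return expected < threshold


# Illustrative numbers only, not data from the study.
shares = QuasiIdentifierShares(age=0.012, sex=0.5, marital_status=0.5)
estimate = expected_matches(400_000, shares, disease_prevalence=1 / 50_000)
print(f"expected matches: {estimate:.3f}")
print("high risk:", is_high_risk(estimate, threshold=1.0))
```

With these placeholder inputs, the estimate falls well below one person, which is the situation the abstract describes as a unique combination of identifiers. A real assessment would draw the population and share figures from census tables and the prevalence from epidemiological sources, and would need to check whether the independence assumption is defensible.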

*Correspondence: [email protected]; [email protected]
1 Dartmouth College, Hanover 03755, New Hampshire, USA
3 Mirador Analytics Ltd. Priorwood House, Melrose TD6 9EF, UK
Full list of author information is available at the end of the article

© The Author(s). 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium