Differentially private release of medical microdata: an efficient and practical approach for preserving informative attribute values

Published: 2020-07-08 Issue: 1 Volume: 20 Page:
ISSN: 1472-6947
Container-title: BMC Medical Informatics and Decision Making
Language: en
Short-container-title: BMC Med Inform Decis Mak

Author:

Lee Hyukki, Chung Yon Dohn

Abstract

Background: Various methods based on k-anonymity have been proposed for publishing medical data while preserving privacy. However, the k-anonymity property assumes that adversaries possess fixed background knowledge. Although differential privacy overcomes this limitation, it is specialized for aggregated results. Thus, it is difficult to obtain high-quality microdata. To address this issue, we propose a differentially private medical microdata release method featuring high utility.

Methods: We propose a method of anonymizing medical data under differential privacy. To improve data utility, especially by preserving informative attribute values, the proposed method adopts three data perturbation approaches: (1) generalization, (2) suppression, and (3) insertion. The proposed method produces an anonymized dataset that is nearly optimal with regard to utility, while preserving privacy.

Results: The proposed method achieves lower information loss than existing methods. Based on a real-world case study, we prove that the results of data analyses using the original dataset and those obtained using a dataset anonymized via the proposed method are considerably similar.

Conclusions: We propose a novel differentially private anonymization method that preserves informative values for the release of medical data. Through experiments, we show that the utility of medical data that has been anonymized via the proposed method is significantly better than that of existing methods.

Publisher

Springer Science and Business Media LLC

Subject

Health Informatics, Health Policy, Computer Science Applications

Link

https://link.springer.com/content/pdf/10.1186/s12911-020-01171-5.pdf

Reference: 24 articles.

1. Ren J-J, Sun T, He Y, Zhang Y. A statistical analysis of vaccine-adverse event data. BMC Med Inform Decis Mak. 2019; 19(1):101.

2. Jing X, Emerson M, Masters D, Brooks M, Buskirk J, Abukamail N, Liu C, Cimino JJ, Shubrook J, De Lacalle S, et al. A visual interactive analytic tool for filtering and summarizing large health data sets coded with hierarchical terminologies (VIADS). BMC Med Inform Decis Mak. 2019; 19(1):31.

3. Sweeney L. Int J Uncertain, Fuzziness Knowl-Based Syst. 2002; 10(05):557–70.

4. Machanavajjhala A, Kifer D, Gehrke J, Venkitasubramaniam M. l-diversity: Privacy beyond k-anonymity. ACM Trans Knowl Discov Data (TKDD). 2007; 1(1):3.

5. Li N, Li T, Venkatasubramanian S. t-closeness: Privacy beyond k-anonymity and l-diversity. In: 2007 IEEE 23rd International Conference on Data Engineering. IEEE Computer Society: 2007. p. 106–15.

Cited by 10 articles.

1. Differentially private and explainable boosting machine with enhanced utility;Neurocomputing;2024-11

2. Cluster-based anonymity model and algorithm for 1:1 dataset with a single sensitive attribute using machine learning technique;Egyptian Informatics Journal;2024-09

3. Deep Learning for Credit Card Fraud Detection: A Review of Algorithms, Challenges, and Solutions;IEEE Access;2024

4. Anonymizing Periodical Releases of SRS Data by Fusing Differential Privacy;2022 IEEE International Conference on Big Data (Big Data);2022-12-17

5. Public comprehension of privacy protections applied to health data shared for research: An Australian cross-sectional study;International Journal of Medical Informatics;2022-11