Abstract
The increasing availability and use of sensitive personal data raise a set of issues regarding the privacy of the individuals behind the data. These concerns become even more important when health data are processed, as most regulations worldwide classify such data as sensitive. Privacy Enhancing Technologies (PETs) attempt to protect the privacy of individuals whilst preserving the utility of the data. One of the most popular recent technologies is Differential Privacy (DP), which was used for the 2020 U.S. Census. Another trend is to combine synthetic data generators with DP to create so-called private synthetic data generators. The objective is to preserve the statistical properties of the original data as accurately as possible, while making the generated data differ as much as possible from the original data with respect to private features. While these technologies seem promising, there is a gap between academic research on DP and synthetic data and the practical application and evaluation of these techniques for real-world use cases. In this paper, we evaluate three private synthetic data generators (MWEM, DP-CTGAN, and PATE-CTGAN) on their use-case-specific privacy and utility. The use case is the analysis of continuous heart rate measurements from different individuals. This work shows that private synthetic data generators have tremendous advantages over traditional techniques, but also require in-depth analysis depending on the use case. Furthermore, each technology has different strengths, so there is no clear winner. However, DP-CTGAN often performs slightly better than the other technologies, so it can be recommended for a continuous medical data use case.
Funder
Fraunhofer Lighthouse Project
Subject
Fluid Flow and Transfer Processes, Computer Science Applications, Process Chemistry and Technology, General Engineering, Instrumentation, General Materials Science