Critical Analysis of Privacy Risks in Machine Learning and Implications for Use of Health Data: A systematic review and meta-analysis on membership inference attacks

Author:

Walker Emily V.1, Bu Jingyu1, Pakseresht Mohammadreza1, Wickham Maeve1, Shack Lorraine1, Robson Paula1, Hegde Nidhi2

Affiliation:

1. Alberta Health Services

2. University of Alberta

Abstract

Purpose. Machine learning (ML) has revolutionized data processing and analysis, with applications in health showing great promise. However, ML poses privacy risks, as models may reveal information about their training data. Developing frameworks to assess and mitigate privacy risks is essential, particularly for health data custodians responsible for adhering to ethical and legal standards in data use. In September 2022, we conducted a systematic review and meta-analysis to estimate the relative effects of factors hypothesized to contribute to ML privacy risk, focusing on membership inference attacks (MIA).

Methods. Papers were screened for relevance to MIA and selected for the meta-analysis if they contained attack performance (AP) metrics for attacks on models trained on numeric data. Random effects regression was used to estimate the adjusted average change in AP by model type, generalization gap, and the density of training data in each region of input space (partitioned density). Residual sum of squares was used to determine the importance of variables on AP.

Results. The systematic review and meta-analysis included 115 and 42 papers, respectively, comprising 1,910 experiments. The average AP ranged from 61.0% (95% CI: 60.0%–63.0%; AUC) to 74.0% (95% CI: 72.0%–76.0%; recall). Higher partitioned density was inversely associated with AP for all model architectures, with the largest effect on decision trees. Higher generalization gap was linked to increased AP, predominantly affecting neural networks. Partitioned density was a better predictor of AP than generalization gap for most architectures.

Conclusions. This is the first quantitative synthesis of MIA experiments. It highlights the effect of dataset composition on AP, particularly on decision trees, which are commonly used in health.
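To make the quantities in the abstract concrete, the sketch below illustrates a simple loss-threshold membership inference attack and the generalization gap. It is a minimal, hypothetical example, not the attack or metric definitions used in the reviewed papers: the function names, the Mann-Whitney-style AUC computation, and the toy loss values are assumptions for illustration only.

```python
# Hypothetical sketch: a loss-threshold MIA scored by AUC, plus the
# generalization gap. All names and numbers here are illustrative.

def mia_auc(member_losses, nonmember_losses):
    """AUC of the rule 'lower loss => member', computed as the probability
    that a randomly chosen member has lower loss than a randomly chosen
    non-member (a Mann-Whitney rank statistic)."""
    wins = ties = 0
    for m in member_losses:
        for n in nonmember_losses:
            if m < n:
                wins += 1
            elif m == n:
                ties += 1
    total = len(member_losses) * len(nonmember_losses)
    return (wins + 0.5 * ties) / total

def generalization_gap(train_acc, test_acc):
    """Train-minus-test accuracy; the meta-analysis found larger gaps
    were linked to higher attack performance."""
    return train_acc - test_acc

# Toy example: an overfit model gives members systematically lower loss.
members = [0.05, 0.10, 0.35, 0.15]      # losses on training records
nonmembers = [0.40, 0.25, 0.30, 0.60]   # losses on held-out records
print(mia_auc(members, nonmembers))      # → 0.875
print(generalization_gap(0.99, 0.80))    # gap of about 0.19
```

An AUC of 0.5 would mean the attack cannot distinguish members from non-members; the averages reported in the abstract (61%–74%, depending on metric) sit well above that baseline.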

Publisher

Research Square Platform LLC

