Anonymization: The imperfect science of using data while preserving privacy

Authors:

Gadotti Andrea (1, 2), Rocher Luc (1, 2), Houssiau Florimond (1, 3), Creţu Ana-Maria (1, 4), de Montjoye Yves-Alexandre (1)

Affiliation:

1. Imperial College London, Exhibition Road, London SW7 2AZ, UK.

2. University of Oxford, Wellington Square, Oxford OX1 2JD, UK.

3. Alan Turing Institute, 96 Euston Road, London NW1 2DB, UK.

4. EPFL, CH-1015 Lausanne, Switzerland.

Abstract

Information about us, our actions, and our preferences is created at scale through surveys or scientific studies or as a result of our interaction with digital devices such as smartphones and fitness trackers. The ability to safely share and analyze such data is key for scientific and societal progress. Anonymization is considered by scientists and policy-makers as one of the main ways to share data while minimizing privacy risks. In this review, we offer a pragmatic perspective on the modern literature on privacy attacks and anonymization techniques. We discuss traditional de-identification techniques and their strong limitations in the age of big data. We then turn our attention to modern approaches to share anonymous aggregate data, such as data query systems, synthetic data, and differential privacy. We find that, although no perfect solution exists, applying modern techniques while auditing their guarantees against attacks is the best approach to safely use and share data today.

Publisher

American Association for the Advancement of Science (AAAS)

References: 266 articles.

