Global healthcare fairness: We should be sharing more, not less, data

Published:2022-10-06 Issue:10 Volume:1 Page:e0000102
ISSN:2767-3170
Container-title:PLOS Digital Health
language:en
Short-container-title:PLOS Digit Health

Author:

Seastedt Kenneth P., Schwab Patrick, O’Brien Zach, Wakida Edith, Herrera Karen, Marcelo Portia Grace F., Agha-Mir-Salim Louis, Frigola Xavier Borrat, Ndulue Emily Boardman, Marcelo Alvin, Celi Leo Anthony

Abstract

The availability of large, deidentified health datasets has enabled significant innovation in using machine learning (ML) to better understand patients and their diseases. However, questions remain regarding the true privacy of this data, patient control over their data, and how we regulate data sharing in a way that does not encumber progress or further potentiate biases for underrepresented populations. After reviewing the literature on potential reidentifications of patients in publicly available datasets, we argue that the cost of slowing ML progress, measured in terms of access to future medical innovations and clinical software, is too great to limit sharing data through large publicly available databases for concerns of imperfect data anonymization. This cost is especially great for developing countries where the barriers preventing inclusion in such databases will continue to rise, further excluding these populations and increasing existing biases that favor high-income countries. Preventing artificial intelligence’s progress towards precision medicine and sliding back to clinical practice dogma may pose a larger threat than concerns of potential patient reidentification within publicly available datasets. While the risk to patient privacy should be minimized, we believe this risk will never be zero, and society has to determine an acceptable risk threshold below which data sharing can occur, for the benefit of a global medical knowledge system.

Publisher

Public Library of Science (PLoS)

References: 60 articles.

1. COVID-19 Chest X-Ray Dataset Initiative. Available from: https://github.com/agchung/Figure1-COVID-chestxray-dataset. [cited Mar 2021].

2. Irvin J, Rajpurkar P, Ko M, Yu Y, Ciurea-Ilcus S, Chute C, et al., editors. Chexpert: A large chest radiograph dataset with uncertainty labels and expert comparison. Proceedings of the AAAI Conference on Artificial Intelligence; 2019.

3. Johnson A. MIMIC-IV (version 1.0). PhysioNet; 2021.

4. Johnson AEW. MIMIC-III, a freely accessible critical care database. Sci Data; 2016.

5. Kaplan LJ. Imagine…(a common language for ICU data inquiry and analysis). Intensive Care Med; 2020.

Cited by 42 articles.

1. Seeing the random forest through the decision trees. Supporting learning health systems from histopathology with machine learning models: Challenges and opportunities;Journal of Pathology Informatics;2024-12

2. Unbiasing fairness evaluation of radiology AI model;Meta-Radiology;2024-09

3. Privacy-Enhancing Technologies in Biomedical Data Science;Annual Review of Biomedical Data Science;2024-08-23

4. Toward the European Health Data Space: The IMPaCT-Data secure infrastructure for EHR-based precision medicine research;Journal of Biomedical Informatics;2024-08

5. F-Chain: personalized overall survival prediction based on incremental adaptive indicators and multi-source clinical records;Memetic Computing;2024-07-16