Abstract
Privacy concerns around sharing personally identifiable information are a major barrier to data sharing in medical research. In many cases, researchers have no interest in a particular individual’s information but rather aim to derive insights at the level of cohorts. Here, we utilise generative adversarial networks (GANs) to create medical imaging datasets consisting entirely of synthetic patient data. The synthetic images ideally have, in aggregate, similar statistical properties to those of a source dataset but do not contain sensitive personal information. We assess the quality of synthetic data generated by two GAN models for chest radiographs with 14 radiology findings and brain computed tomography (CT) scans with six types of intracranial haemorrhage. We measure synthetic image quality by the performance difference of predictive models trained on either the synthetic or the real dataset. We find that synthetic data performance disproportionately benefits from a reduced number of classes. Our benchmark also indicates that at low numbers of samples per class, label overfitting effects start to dominate GAN training. We conducted a reader study in which trained radiologists discriminated between synthetic and real images. In accordance with our benchmark results, the classification accuracy of radiologists improves with increasing resolution. Our study offers valuable guidelines and outlines practical conditions under which insights derived from synthetic images are similar to those that would have been derived from real data. Our results indicate that synthetic data sharing may be an attractive alternative to sharing real patient-level data in the right setting.
Publisher
Springer Science and Business Media LLC
Subject
Health Information Management, Health Informatics, Computer Science Applications, Medicine (miscellaneous)
Cited by
31 articles.