Application of a validated prostate MRI deep learning system to independent same-vendor multi-institutional data: demonstration of transferability-Reference-Cited by-同舟云学术

Application of a validated prostate MRI deep learning system to independent same-vendor multi-institutional data: demonstration of transferability

Published:2023-07-28 Issue:11 Volume:33 Page:7463-7476
ISSN:1432-1084
Container-title:European Radiology
language:en
Short-container-title:Eur Radiol

Author:

Netzer Nils^ORCID,Eith Carolin,Bethge Oliver,Hielscher Thomas,Schwab Constantin,Stenzinger Albrecht,Gnirs Regula,Schlemmer Heinz-Peter,Maier-Hein Klaus H.,Schimmöller Lars,Bonekamp David^ORCID

Abstract

Abstract Objectives To evaluate a fully automatic deep learning system to detect and segment clinically significant prostate cancer (csPCa) on same-vendor prostate MRI from two different institutions not contributing to training of the system. Materials and methods In this retrospective study, a previously bi-institutionally validated deep learning system (UNETM) was applied to bi-parametric prostate MRI data from one external institution (A), a PI-RADS distribution-matched internal cohort (B), and a csPCa stratified subset of single-institution external public challenge data (C). csPCa was defined as ISUP Grade Group ≥ 2 determined from combined targeted and extended systematic MRI/transrectal US-fusion biopsy. Performance of UNETM was evaluated by comparing ROC AUC and specificity at typical PI-RADS sensitivity levels. Lesion-level analysis between UNETM segmentations and radiologist-delineated segmentations was performed using Dice coefficient, free-response operating characteristic (FROC), and weighted alternative (waFROC). The influence of using different diffusion sequences was analyzed in cohort A. Results In 250/250/140 exams in cohorts A/B/C, differences in ROC AUC were insignificant with 0.80 (95% CI: 0.74–0.85)/0.87 (95% CI: 0.83–0.92)/0.82 (95% CI: 0.75–0.89). At sensitivities of 95% and 90%, UNETM achieved specificity of 30%/50% in A, 44%/71% in B, and 43%/49% in C, respectively. Dice coefficient of UNETM and radiologist-delineated lesions was 0.36 in A and 0.49 in B. The waFROC AUC was 0.67 (95% CI: 0.60–0.83) in A and 0.7 (95% CI: 0.64–0.78) in B. UNETM performed marginally better on readout-segmented than on single-shot echo-planar-imaging. Conclusion For same-vendor examinations, deep learning provided comparable discrimination of csPCa and non-csPCa lesions and examinations between local and two independent external data sets, demonstrating the applicability of the system to institutions not participating in model training. Clinical relevance statement A previously bi-institutionally validated fully automatic deep learning system maintained acceptable exam-level diagnostic performance in two independent external data sets, indicating the potential of deploying AI models without retraining or fine-tuning, and corroborating evidence that AI models extract a substantial amount of transferable domain knowledge about MRI-based prostate cancer assessment. Key Points • A previously bi-institutionally validated fully automatic deep learning system maintained acceptable exam-level diagnostic performance in two independent external data sets. • Lesion detection performance and segmentation congruence was similar on the institutional and an external data set, as measured by the weighted alternative FROC AUC and Dice coefficient. • Although the system generalized to two external institutions without re-training, achieving expected sensitivity and specificity levels using the deep learning system requires probability thresholds to be adjusted, underlining the importance of institution-specific calibration and quality control.

Funder

Deutsches Krebsforschungszentrum (DKFZ)

Publisher

Springer Science and Business Media LLC

Subject

Radiology, Nuclear Medicine and imaging,General Medicine

Link

https://link.springer.com/content/pdf/10.1007/s00330-023-09882-9.pdf

Reference41 articles.

1. Turkbey B, Rosenkrantz AB, Haider MA et al (2019) Prostate Imaging Reporting and Data System Version 2.1: 2019 Update of Prostate Imaging Reporting and Data System Version 2. Eur Urol 76:340–351

2. Schelb P, Kohl S, Radtke JP et al (2019) Classification of cancer at prostate MRI: deep learning versus clinical PI-RADS assessment. Radiology 293:607–617

3. Schelb P, Wang X, Radtke JP et al (2020) Simulated clinical deployment of fully automatic deep learning for clinical prostate MRI assessment. Eur Radiol. https://doi.org/10.1007/s00330-020-07086-z