Application of a validated prostate MRI deep learning system to independent same-vendor multi-institutional data: demonstration of transferability

Author:

Netzer Nils, Eith Carolin, Bethge Oliver, Hielscher Thomas, Schwab Constantin, Stenzinger Albrecht, Gnirs Regula, Schlemmer Heinz-Peter, Maier-Hein Klaus H., Schimmöller Lars, Bonekamp David

Abstract

Objectives: To evaluate a fully automatic deep learning system to detect and segment clinically significant prostate cancer (csPCa) on same-vendor prostate MRI from two different institutions not contributing to training of the system.

Materials and methods: In this retrospective study, a previously bi-institutionally validated deep learning system (UNETM) was applied to bi-parametric prostate MRI data from one external institution (A), a PI-RADS distribution-matched internal cohort (B), and a csPCa stratified subset of single-institution external public challenge data (C). csPCa was defined as ISUP Grade Group ≥ 2 determined from combined targeted and extended systematic MRI/transrectal US-fusion biopsy. Performance of UNETM was evaluated by comparing ROC AUC and specificity at typical PI-RADS sensitivity levels. Lesion-level analysis between UNETM segmentations and radiologist-delineated segmentations was performed using Dice coefficient, free-response operating characteristic (FROC), and weighted alternative (waFROC). The influence of using different diffusion sequences was analyzed in cohort A.

Results: In 250/250/140 exams in cohorts A/B/C, differences in ROC AUC were insignificant with 0.80 (95% CI: 0.74–0.85)/0.87 (95% CI: 0.83–0.92)/0.82 (95% CI: 0.75–0.89). At sensitivities of 95% and 90%, UNETM achieved specificity of 30%/50% in A, 44%/71% in B, and 43%/49% in C, respectively. Dice coefficient of UNETM and radiologist-delineated lesions was 0.36 in A and 0.49 in B. The waFROC AUC was 0.67 (95% CI: 0.60–0.83) in A and 0.7 (95% CI: 0.64–0.78) in B. UNETM performed marginally better on readout-segmented than on single-shot echo-planar-imaging.

Conclusion: For same-vendor examinations, deep learning provided comparable discrimination of csPCa and non-csPCa lesions and examinations between local and two independent external data sets, demonstrating the applicability of the system to institutions not participating in model training.
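The Dice coefficient used for the lesion-level comparison measures the overlap between two binary segmentations as 2|A ∩ B| / (|A| + |B|). The following is a minimal sketch of that formula only, assuming flattened 0/1 voxel masks; the function name `dice_coefficient` and the empty-mask convention are illustrative assumptions and do not describe how the study computed its values.

```python
def dice_coefficient(mask_a, mask_b):
    """Dice overlap 2|A ∩ B| / (|A| + |B|) of two flattened binary masks.

    Assumption: masks are equal-length sequences of 0/1 (or bool) voxels.
    """
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must have the same number of voxels")
    a = [bool(v) for v in mask_a]
    b = [bool(v) for v in mask_b]
    # Voxels marked as lesion in both masks
    intersection = sum(1 for x, y in zip(a, b) if x and y)
    total = sum(a) + sum(b)
    if total == 0:
        # Assumed convention: two empty masks count as perfect agreement
        return 1.0
    return 2.0 * intersection / total


# Hypothetical toy masks (not study data): one of two lesion voxels overlaps
model_mask = [1, 1, 0, 0]
reader_mask = [1, 0, 1, 0]
overlap = dice_coefficient(model_mask, reader_mask)
```

A value of 1 means identical masks and 0 means no shared voxels, so the reported 0.36 and 0.49 indicate partial overlap between model and radiologist delineations.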
Clinical relevance statement: A previously bi-institutionally validated fully automatic deep learning system maintained acceptable exam-level diagnostic performance in two independent external data sets, indicating the potential of deploying AI models without retraining or fine-tuning, and corroborating evidence that AI models extract a substantial amount of transferable domain knowledge about MRI-based prostate cancer assessment.

Key Points
• A previously bi-institutionally validated fully automatic deep learning system maintained acceptable exam-level diagnostic performance in two independent external data sets.
• Lesion detection performance and segmentation congruence were similar on the institutional and an external data set, as measured by the weighted alternative FROC AUC and Dice coefficient.
• Although the system generalized to two external institutions without re-training, achieving expected sensitivity and specificity levels using the deep learning system requires probability thresholds to be adjusted, underlining the importance of institution-specific calibration and quality control.
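The point about adjusting probability thresholds can be illustrated with a short sketch: on a local cohort with biopsy-confirmed labels, pick the highest threshold that still reaches a target sensitivity, then read off the specificity at that threshold. This is an assumed, simplified procedure for illustration, not the calibration method of the study; the function names and the toy scores are hypothetical.

```python
import math


def threshold_for_sensitivity(scores, labels, target_sensitivity):
    """Highest threshold t such that calling score >= t positive reaches
    at least the target sensitivity among the label-1 (csPCa) exams."""
    positives = sorted((s for s, y in zip(scores, labels) if y == 1), reverse=True)
    if not positives:
        raise ValueError("at least one positive exam is required")
    # Number of positives that must be detected; round() guards float noise
    needed = max(1, math.ceil(round(target_sensitivity * len(positives), 9)))
    return positives[needed - 1]


def sensitivity_specificity(scores, labels, threshold):
    """Exam-level sensitivity and specificity at a given score threshold."""
    tp = fn = tn = fp = 0
    for s, y in zip(scores, labels):
        predicted_positive = s >= threshold
        if y == 1:
            tp += predicted_positive
            fn += not predicted_positive
        else:
            fp += predicted_positive
            tn += not predicted_positive
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical local calibration cohort (not study data)
local_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
local_labels = [1, 1, 0, 1, 0, 1, 0, 0]

t = threshold_for_sensitivity(local_scores, local_labels, 0.75)
sens, spec = sensitivity_specificity(local_scores, local_labels, t)
```

Because score distributions can shift between institutions, a threshold chosen this way on one cohort may yield a different sensitivity elsewhere, which is why the threshold would be re-derived on local data.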

Funder

Deutsches Krebsforschungszentrum (DKFZ)

Publisher

Springer Science and Business Media LLC

Subject

Radiology, Nuclear Medicine and Imaging; General Medicine

Cited by 1 article.
