Is the generalizability of a developed artificial intelligence algorithm for COVID-19 on chest CT sufficient for clinical use? Results from the International Consortium for COVID-19 Imaging AI (ICOVAI)

Published: 2023-01-18
Issue: 6
Volume: 33
Page: 4249-4258
ISSN: 1432-1084
Container-title: European Radiology
Language: en
Short-container-title: Eur Radiol

Author:

Topff Laurens, Groot Lipman Kevin B. W., Guffens Frederic, Wittenberg Rianne, Bartels-Rutten Annemarieke, van Veenendaal Gerben, Hess Mirco, Lamerigts Kay, Wakkie Joris, Ranschaert Erik, Trebeschi Stefano, Visser Jacob J., Beets-Tan Regina G. H., Guiot Julien, Snoeckx Annemiek, Kint Peter, Van Hoe Lieven, Quattrocchi Carlo Cosimo, Dieckens Dennis, Lounis Samir, Schulze Eric, Sjer Arnout Eric-bart, van Vucht Niels, Tielbeek Jeroen A.W., Raat Frank, Eijspaart Daniël, Abbas Ausami

Abstract

Objectives: Only a few published artificial intelligence (AI) studies for COVID-19 imaging have been externally validated. Assessing the generalizability of developed models is essential, especially when considering clinical implementation. We report the development of the International Consortium for COVID-19 Imaging AI (ICOVAI) model and perform independent external validation.

Methods: The ICOVAI model was developed using multicenter data (n = 1286 CT scans) to quantify disease extent and assess COVID-19 likelihood using the COVID-19 Reporting and Data System (CO-RADS). A ResUNet model was modified to automatically delineate lung contours and infectious lung opacities on CT scans, after which a random forest predicted the CO-RADS score. After internal testing, the model was externally validated on a multicenter dataset (n = 400) by independent researchers. CO-RADS classification performance was calculated using linearly weighted Cohen’s kappa, and segmentation performance using the Dice Similarity Coefficient (DSC).

Results: Comparing internal with external testing, segmentation performance of lung contours was equally excellent (DSC = 0.97 vs. DSC = 0.97, p = 0.97). Segmentation performance for lung opacities was adequate internally (DSC = 0.76), but significantly worse on external validation (DSC = 0.59, p < 0.0001). For CO-RADS classification, agreement with radiologists on the internal set was substantial (kappa = 0.78), but significantly lower on the external set (kappa = 0.62, p < 0.0001).

Conclusion: In this multicenter study, a model developed for CO-RADS score prediction and quantification of COVID-19 disease extent showed a significant reduction in performance on independent external validation compared with internal testing. The limited reproducibility of the model restricted its potential for clinical use. The study demonstrates the importance of independent external validation of AI models.

Key Points

• The ICOVAI model for prediction of CO-RADS and quantification of disease extent on chest CT of COVID-19 patients was developed using a large sample of multicenter data.

• Performance on internal testing was substantial; however, it was significantly reduced on external validation, performed by independent researchers. The limited generalizability of the model restricts its potential for clinical use.

• Results of AI models for COVID-19 imaging on internal tests may not generalize well to external data, demonstrating the importance of independent external validation.
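The two metrics named in the Methods can be illustrated with a minimal sketch. This is not the ICOVAI code: the function names, the flattened-list mask representation, the empty-mask convention, and the category list 1 to 5 are assumptions made for illustration. The Dice Similarity Coefficient is computed as 2|A ∩ B| / (|A| + |B|), and the linearly weighted Cohen's kappa as 1 minus the ratio of observed to chance-expected weighted disagreement, with weights |i - j| / (k - 1).

```python
from collections import Counter
from typing import Sequence


def dice_coefficient(pred: Sequence[int], truth: Sequence[int]) -> float:
    """Dice Similarity Coefficient for two flattened binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must contain the same number of voxels")
    overlap = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Assumed convention: two empty masks count as perfect agreement.
    return 1.0 if total == 0 else 2.0 * overlap / total


def linear_weighted_kappa(
    rater_a: Sequence[int], rater_b: Sequence[int], categories: Sequence[int]
) -> float:
    """Cohen's kappa with linear weights w_ij = |i - j| / (k - 1)."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    k = len(categories)
    if k < 2:
        raise ValueError("at least two ordered categories are required")
    index = {c: i for i, c in enumerate(categories)}
    n = len(rater_a)
    pairs = Counter((index[x], index[y]) for x, y in zip(rater_a, rater_b))
    row = Counter(index[x] for x in rater_a)
    col = Counter(index[y] for y in rater_b)
    # Observed weighted disagreement, averaged over all rated cases.
    d_obs = sum(abs(i - j) * c for (i, j), c in pairs.items()) / (n * (k - 1))
    # Weighted disagreement expected by chance from the marginal counts.
    d_exp = sum(
        abs(i - j) * row[i] * col[j] for i in range(k) for j in range(k)
    ) / (n * n * (k - 1))
    # If both raters used one identical category, chance disagreement is zero.
    return 1.0 if d_exp == 0 else 1.0 - d_obs / d_exp


# Hypothetical toy data, not taken from the study.
example_dsc = dice_coefficient([1, 1, 0, 0], [1, 0, 0, 0])
example_kappa = linear_weighted_kappa(
    [1, 2, 3, 4, 5], [1, 2, 3, 5, 5], categories=[1, 2, 3, 4, 5]
)
```

Linear weights penalize a disagreement in proportion to its distance on the ordinal scale, so a model output of CO-RADS 4 against a radiologist's 5 costs less than an output of 1 against 5.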

Publisher

Springer Science and Business Media LLC

Subject

Radiology, Nuclear Medicine and Imaging; General Medicine

Link

https://link.springer.com/content/pdf/10.1007/s00330-022-09303-3.pdf

References: 27 articles.

1. Shi F, Wang J, Shi J et al (2021) Review of artificial intelligence techniques in imaging data acquisition, segmentation, and diagnosis for COVID-19. IEEE Rev Biomed Eng 14:4–15. https://doi.org/10.1109/RBME.2020.2987975

2. Francone M, Iafrate F, Masci GM et al (2020) Chest CT score in COVID-19 patients: correlation with disease severity and short-term prognosis. Eur Radiol 30:6808–6817. https://doi.org/10.1007/s00330-020-07033-y

3. Yang R, Li X, Liu H et al (2020) Chest CT severity score: an imaging tool for assessing severe COVID-19. Radiol Cardiothorac Imaging 2:e200047. https://doi.org/10.1148/ryct.2020200047

4. Wang X, Hu X, Tan W et al (2021) Multicenter study of temporal changes and prognostic value of a CT visual severity score in hospitalized patients with coronavirus disease (COVID-19). AJR Am J Roentgenol 217:83–92. https://doi.org/10.2214/AJR.20.24044

5. Lanza E, Muglia R, Bolengo I et al (2020) Quantitative chest CT analysis in COVID-19 to predict the need for oxygenation support and intubation. Eur Radiol 30:6770–6778. https://doi.org/10.1007/s00330-020-07013-2

Cited by 4 articles.

1. Evolving and Novel Applications of Artificial Intelligence in Thoracic Imaging; Diagnostics; 2024-07-08

2. Clinical, Cultural, Computational, and Regulatory Considerations to Deploy AI in Radiology: Perspectives of RSNA and MICCAI Experts; Radiology: Artificial Intelligence; 2024-07-01

3. Advancing differential diagnosis: a comprehensive review of deep learning approaches for differentiating tuberculosis, pneumonia, and COVID-19; Multimedia Tools and Applications; 2024-05-27

4. Magnetic resonance imaging based deep-learning model: a rapid, high-performance, automated tool for testicular volume measurements; Frontiers in Medicine; 2023-09-19