Author:
Bridge Joshua, Meng Yanda, Zhu Wenyue, Fitzmaurice Thomas, McCann Caroline, Addison Cliff, Wang Manhui, Merritt Cristin, Franks Stu, Mackey Maria, Messenger Steve, Sun Renrong, Zhao Yitian, Zheng Yalin
Abstract
Objectives: To develop and externally geographically validate a mixed-effects deep learning model to diagnose COVID-19 from computed tomography (CT) imaging following best practice guidelines and assess the strengths and weaknesses of deep learning COVID-19 diagnosis.
Design: Model development and external validation with retrospectively collected data from two countries.
Setting: Hospitals in Moscow, Russia, collected between March 1, 2020, and April 25, 2020. The China Consortium of Chest CT Image Investigation (CC-CCII) collected between January 25, 2020, and March 27, 2020.
Participants: 1,110 and 796 patients with either COVID-19 or healthy CT volumes from Moscow, Russia, and China, respectively.
Main outcome measures: We developed a deep learning model with a novel mixed-effects layer to model the relationship between slices in CT imaging. The model was trained on a dataset from hospitals in Moscow, Russia, and externally geographically validated on a dataset from a consortium of Chinese hospitals. Model performance was evaluated in discriminative performance using the area under the receiver operating characteristic (AUROC), sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV). In addition, calibration performance was assessed using calibration curves, and clinical benefit was assessed using decision curve analysis. Finally, the model’s decisions were assessed visually using saliency maps.
Results: External validation on the large Chinese dataset showed excellent performance with an AUROC of 0.936 (95% CI: 0.910, 0.961). Using a probability threshold of 0.5, the sensitivity, specificity, NPV, and PPV were 0.753 (0.647, 0.840), 0.909 (0.869, 0.940), 0.711 (0.606, 0.802), and 0.925 (0.888, 0.953), respectively.
Conclusions: Deep learning can reduce stress on healthcare systems by automatically screening CT imaging for COVID-19. However, deep learning models must be robustly assessed using various performance measures and externally validated in each setting. In addition, best practice guidelines for developing and reporting predictive models are vital for the safe adoption of such models.
Statements: The authors do not own any of the patient data, and ethics approval was not needed. The lead author affirms that this manuscript is an honest, accurate, and transparent account of the study being reported, that no important aspects of the study have been omitted, and that any discrepancies from the study as planned (and, if relevant, registered) have been explained. Patients and the public were not involved in the study.
Funding: This study was funded by EPSRC studentship (No. 2110275), EPSRC Impact Acceleration Account (IAA) funding, and Amazon Web Services.
Summary
What is already known on this topic:
- Deep learning can diagnose diseases from imaging data automatically
- Many studies using deep learning are of poor quality and fail to follow current best practice guidelines for the development and reporting of predictive models
- Current methods do not adequately model the relationship between slices in CT volumetric data
What this study adds:
- A novel method to analyse volumetric imaging data composed of slices such as CT images using deep learning
- Model developed following current best-practice guidelines for the development and reporting of prediction models
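The threshold-based measures named in the abstract (sensitivity, specificity, PPV, NPV) and the net benefit used in decision curve analysis can all be derived from a confusion matrix at a chosen probability threshold. The following is a minimal sketch of those standard definitions, not the authors' evaluation code; the function name `threshold_metrics`, its arguments, and the toy labels and probabilities are hypothetical and chosen only for illustration. It assumes binary labels (1 = COVID-19, 0 = healthy) and a threshold strictly between 0 and 1.

```python
def threshold_metrics(y_true, y_prob, threshold=0.5):
    """Confusion-matrix metrics at one probability threshold (illustrative helper)."""
    tp = fp = tn = fn = 0
    for label, prob in zip(y_true, y_prob):
        predicted_positive = prob >= threshold
        if predicted_positive and label == 1:
            tp += 1
        elif predicted_positive and label == 0:
            fp += 1
        elif not predicted_positive and label == 0:
            tn += 1
        else:
            fn += 1

    def ratio(numerator, denominator):
        # Return None instead of raising when a metric is undefined.
        return numerator / denominator if denominator else None

    n = tp + fp + tn + fn
    return {
        "sensitivity": ratio(tp, tp + fn),  # true positive rate
        "specificity": ratio(tn, tn + fp),  # true negative rate
        "ppv": ratio(tp, tp + fp),          # positive predictive value
        "npv": ratio(tn, tn + fn),          # negative predictive value
        # Net benefit as used in decision curve analysis:
        # TP/n - FP/n * (pt / (1 - pt)), with pt the threshold probability.
        "net_benefit": tp / n - (fp / n) * (threshold / (1 - threshold)),
    }


# Toy data, invented for illustration only (not from the study).
labels = [1, 1, 1, 0, 0, 0, 0, 1]
probabilities = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.2, 0.7]
metrics = threshold_metrics(labels, probabilities, threshold=0.5)
```

Repeating the net-benefit calculation over a range of thresholds gives the points of a decision curve; confidence intervals such as those reported in the Results would need an additional step (for example bootstrapping), which is not shown here.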
Publisher
Cold Spring Harbor Laboratory
Cited by
1 article.