Affiliation:
1. Division of Cardiology, Department of Medicine, University of California, San Francisco, San Francisco, CA, USA
2. Department of Obstetrics, Division of Fetal Medicine, Leiden University Medical Center, Leiden, The Netherlands
3. Department of Pediatrics, Division of Cardiology, University of California, San Francisco, San Francisco, CA, USA
4. Bakar Computational Health Sciences Institute; Department of Radiology; UCSF Berkeley Joint Program in Computational Precision Health; Center for Intelligent Imaging; Biological and Medical Informatics, University of California, San Francisco, San Francisco, CA, USA
Abstract
Objectives
Despite nearly universal prenatal ultrasound screening programs, congenital heart defects (CHD) are still missed, which may result in severe morbidity or even death. Deep machine learning (DL) can automate image recognition from ultrasound. The main aim of this study was to assess the performance of a previously developed DL model, trained on images from a tertiary center, using fetal ultrasound images obtained during the second‐trimester standard anomaly scan in a low‐risk population. A secondary aim was to compare initial screening diagnosis, which made use of live imaging at the point‐of‐care, with diagnosis by clinicians evaluating only stored images.
Methods
All pregnancies with isolated severe CHD in the Northwestern region of The Netherlands between 2015 and 2016 with available stored images were evaluated, as well as a sample of normal fetuses' examinations from the same region and time period. We compared the accuracy of the initial clinical diagnosis (made in real time with access to live imaging) with that of the model (which had only stored imaging available) and with the performance of three blinded human experts who had access only to the stored images (like the model). We analyzed performance according to ultrasound study characteristics, such as duration and quality (scored independently by investigators), number of stored images and availability of screening views.
Results
A total of 42 normal fetuses and 66 cases of isolated CHD at birth were analyzed. Of the abnormal cases, 31 were missed and 35 were detected at the time of the clinical anatomy scan (sensitivity, 53%). Model sensitivity and specificity were 91% and 78%, respectively. Blinded human experts (n = 3) achieved mean ± SD sensitivity and specificity of 55 ± 10% (range, 47–67%) and 71 ± 13% (range, 57–83%), respectively. There was a statistically significant difference in model correctness according to expert‐graded image quality (P = 0.03). The abnormal cases included 19 lesions that the model had not encountered during its training; the model's performance in these cases (16/19 correct) was not statistically significantly different from that for previously encountered lesions (P = 0.41).
Conclusions
A previously trained DL algorithm had higher sensitivity than initial clinical assessment in detecting CHD in a cohort in which over 50% of CHD cases were initially missed clinically. Notably, the DL algorithm performed well on community‐acquired images in a low‐risk population, including lesions to which it had not been exposed previously. Furthermore, when both the model and blinded human experts had access to only stored images and not the full range of images available to a clinician during a live scan, the model outperformed the human experts. Together, these findings support the proposition that use of DL models can improve prenatal detection of CHD. © 2023 International Society of Ultrasound in Obstetrics and Gynecology.
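As a reader's aid, the reported clinical sensitivity can be reproduced from the counts given in the abstract. The short Python sketch below is illustrative only and is not the authors' analysis code; the function name `sensitivity` and the variable names are assumptions introduced here. It uses only the counts stated above (35 detected and 31 missed CHD cases; 16 of 19 previously unseen lesions classified correctly).

```python
def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Fraction of truly abnormal cases that were flagged as abnormal."""
    total_abnormal = true_positives + false_negatives
    if total_abnormal == 0:
        raise ValueError("sensitivity is undefined without abnormal cases")
    return true_positives / total_abnormal


# Clinical anatomy scan: 35 CHD cases detected, 31 missed (counts from the abstract).
clinical_sensitivity = sensitivity(true_positives=35, false_negatives=31)

# Model performance on lesions absent from its training data: 16 of 19 correct.
unseen_lesion_fraction_correct = 16 / 19

print(f"Clinical sensitivity: {clinical_sensitivity:.0%}")  # Clinical sensitivity: 53%
print(f"Unseen lesions correct: {unseen_lesion_fraction_correct:.0%}")  # Unseen lesions correct: 84%
```

Specificity is computed the same way over the 42 normal examinations (true negatives divided by all normal cases); the abstract reports only the resulting percentages for the model and the experts, so those counts are not reconstructed here.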
Funder
National Institutes of Health
U.S. Department of Defense
Subject
Obstetrics and Gynecology; Radiology, Nuclear Medicine and Imaging; Reproductive Medicine; General Medicine; Radiological and Ultrasound Technology
Cited by
3 articles.