Validation of an artificial intelligence model for 12-lead ECG interpretation

Author:

Demolder A (1), Herman R (2), Vavrik B (1), Martonak M (1), Boza V (1), Herman M (1), Palus T (1), Kresnakova V (1), Bahyl J (1), Iring A (1), Nelis O (3), Fabbricatore D (3), Perl L (4), Hatala R (5), Bartunek J (3)

Affiliation:

1. Powerful Medical, Bratislava, Slovakia

2. University of Naples Federico II, Department of Advanced Biomedical Sciences, Naples, Italy

3. Cardiovascular Research Center Aalst, Aalst, Belgium

4. Rabin Medical Center, Department of Cardiology, Petah Tikva, Israel

5. National Institute of Cardiovascular Diseases, Department of Arrhythmia and Pacing, Bratislava, Slovakia

Abstract

Background: The electrocardiogram (ECG) is one of the most accessible and comprehensive diagnostic tools to assess cardiac abnormalities. However, automated ECG interpretation remains inferior to physician interpretation in terms of accuracy and reliability.

Purpose: This study evaluated the accuracy of an AI-powered ECG model in diagnosing 12-lead ECGs and compared its diagnostic performance with that of primary care physicians and cardiologists through extensive benchmarking.

Methods: A deep neural network (DNN) was trained on standard 12-lead ECGs to detect 38 diagnoses, grouped into 6 categories (rhythm, conduction abnormalities, chamber enlargement, infarction, ectopy, and axis) that represent the most common types of electrocardiographic abnormalities. Performance of the AI-powered ECG diagnosis was evaluated on an independent test set annotated by consensus of two expert cardiologists. Benchmarking was performed against three individual primary care physicians and six individual cardiologists who independently annotated the same ECG test set. The key metrics used to compare performance were positive predictive value (PPV), negative predictive value (NPV), sensitivity, specificity, and F1 score.

Results: A total of 931,344 standard 12-lead ECGs from 172,750 patients were used to train the DNN. The independent test set had 11,932 annotated ECG labels. The model attained an overall mean F1 score of 0.921, sensitivity 0.910 (0.889–0.931), specificity 0.968 (0.954–0.981), PPV 0.939 (0.919–0.958), and NPV 0.965 (0.951–0.979) [Figure 1]. In all 6 diagnostic categories, the DNN achieved higher mean F1 scores than the mean cardiologist and the mean primary care physician (DNN vs. cardiologists vs. primary care physicians: rhythm 0.951 vs. 0.892 vs. 0.734; conduction abnormalities 0.883 vs. 0.824 vs. 0.693; chamber enlargement 0.970 vs. 0.761 vs. 0.562; infarction 0.918 vs. 0.853 vs. 0.781; ectopy 0.966 vs. 0.951 vs. 0.897; axis 0.909 vs. 0.644 vs. 0.528). The DNN identified atrial fibrillation with near-perfect performance (PPV 0.989, NPV 0.990). Based on the F1 scores for all individual diagnoses, the DNN's diagnostic performance surpassed that of primary care physicians and was non-inferior to that of cardiologists.

Conclusions: Our results demonstrate the AI-powered ECG model's ability to accurately identify electrocardiographic abnormalities from the 12-lead ECG, showcasing its utility as a clinical tool for healthcare professionals.
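For readers less familiar with the five reported metrics, the sketch below shows how each follows from the confusion-matrix counts of a single diagnosis. It is a minimal illustration using the standard definitions, not the authors' evaluation code: the `ConfusionCounts` class, the `diagnostic_metrics` function, and the example counts are all hypothetical, and the abstract does not state how per-diagnosis scores were averaged into the overall means.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts for one diagnosis, model output vs. reference annotation (hypothetical container)."""
    tp: int  # diagnosis present, model says present
    fp: int  # diagnosis absent, model says present
    tn: int  # diagnosis absent, model says absent
    fn: int  # diagnosis present, model says absent


def _ratio(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else math.nan


def diagnostic_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Compute sensitivity, specificity, PPV, NPV, and F1 from confusion counts."""
    sensitivity = _ratio(c.tp, c.tp + c.fn)  # share of true cases detected
    specificity = _ratio(c.tn, c.tn + c.fp)  # share of non-cases correctly cleared
    ppv = _ratio(c.tp, c.tp + c.fp)          # share of positive calls that are correct
    npv = _ratio(c.tn, c.tn + c.fn)          # share of negative calls that are correct
    # F1 is the harmonic mean of PPV and sensitivity, written here in count form.
    f1 = _ratio(2 * c.tp, 2 * c.tp + c.fp + c.fn)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "f1": f1,
    }


# Made-up counts for illustration only; they are not taken from the study.
example = ConfusionCounts(tp=90, fp=10, tn=880, fn=20)
metrics = diagnostic_metrics(example)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because F1 ignores true negatives, it is less inflated than specificity or NPV when a diagnosis is rare in the test set, which may be why it serves as the headline comparison between the model and the physicians.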

Publisher

Oxford University Press (OUP)

Subject

Cardiology and Cardiovascular Medicine

Cited by 2 articles.
