A Two-Step Approach to Overcoming Data Imbalance in the Development of an Electrocardiography Data Quality Assessment Algorithm: A Real-World Data Challenge

Author:

Kim Hyun Joo (1), Venkat S. Jayakumar (2), Chang Hyoung Woo (2), Cho Yang Hyun (3), Lee Jee Yang (2), Koo Kyunghee (2)

Affiliation:

1. Department of Anesthesiology and Pain Medicine, Anesthesia and Pain Research Institute, Severance Hospital, Yonsei University College of Medicine, Seoul 03722, Republic of Korea

2. Department of Thoracic and Cardiovascular Surgery, Seoul National University Bundang Hospital, Seoul National University College of Medicine, Gyeonggi-do, Seongnam-si 13620, Republic of Korea

3. Department of Thoracic and Cardiovascular Surgery, Samsung Medical Center, Sungkyunkwan University College of Medicine, Seoul 06351, Republic of Korea

Abstract

Continuously acquired biosignals from patient monitors contain significant amounts of unusable data. During the development of a decision support system based on continuously acquired biosignals, we developed machine learning and deep learning algorithms to automatically classify the quality of ECG data. A total of 31,127 20-s ECG segments sampled at 250 Hz were used as the training/validation dataset. Data quality was categorized into three classes: acceptable, unacceptable, and uncertain. In the training/validation dataset, 29,606 segments (95%) were in the acceptable class. Two one-step, three-class approaches and two two-step binary sequential approaches were developed using random forest (RF) and two-dimensional convolutional neural network (2D CNN) classifiers. The four approaches were tested on 9779 test samples from another hospital. On the test dataset, the two-step 2D CNN approach showed the best overall accuracy (0.85), and the one-step, three-class 2D CNN approach showed the worst overall accuracy (0.54). The most important parameter, precision in the acceptable class, was greater than 0.9 for all approaches, but recall in the acceptable class was better for the two-step approaches: 0.77 (one-step) vs. 0.89 (two-step) for RF, and 0.51 (one-step) vs. 0.94 (two-step) for 2D CNN (p < 0.001 for both comparisons). For ECG quality classification, where substantial data imbalance exists, the two-step approaches showed more robust performance than the one-step approaches. This algorithm can be used as a preprocessing step in artificial intelligence research using continuously acquired biosignals.
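As a rough illustration of what a two-step binary sequential approach can look like, the sketch below chains two binary decisions into a three-class label and computes precision and recall for one class. It is not the authors' implementation: the abstract does not say which binary split comes first, so the ordering here (acceptable vs. the rest, then unacceptable vs. uncertain) is an assumption. The names `two_step_predict` and `precision_recall`, and the toy threshold classifiers, are hypothetical stand-ins for the trained RF or 2D CNN models.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

Segment = TypeVar("Segment")

ACCEPTABLE, UNACCEPTABLE, UNCERTAIN = "acceptable", "unacceptable", "uncertain"


def two_step_predict(
    segments: Sequence[Segment],
    is_acceptable: Callable[[Segment], bool],
    is_unacceptable: Callable[[Segment], bool],
) -> List[str]:
    """Chain two binary classifiers into a three-class decision.

    Step 1 (assumed order): acceptable vs. everything else.
    Step 2: among the rest, unacceptable vs. uncertain.
    """
    labels = []
    for seg in segments:
        if is_acceptable(seg):
            labels.append(ACCEPTABLE)
        elif is_unacceptable(seg):
            labels.append(UNACCEPTABLE)
        else:
            labels.append(UNCERTAIN)
    return labels


def precision_recall(
    y_true: Sequence[str], y_pred: Sequence[str], cls: str
) -> Tuple[float, float]:
    """Precision and recall for one class, treating it as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == cls and p == cls)
    fp = sum(1 for t, p in pairs if t != cls and p == cls)
    fn = sum(1 for t, p in pairs if t == cls and p != cls)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


if __name__ == "__main__":
    # Toy data: each "segment" is a made-up quality score, not a real ECG.
    scores = [0.9, 0.8, 0.2, 0.5, 0.95, 0.1]
    y_true = [ACCEPTABLE, ACCEPTABLE, UNACCEPTABLE,
              UNCERTAIN, UNCERTAIN, UNACCEPTABLE]

    # Hypothetical threshold rules standing in for trained binary models.
    y_pred = two_step_predict(
        scores,
        is_acceptable=lambda s: s >= 0.7,
        is_unacceptable=lambda s: s < 0.3,
    )
    print(y_pred)
    print(precision_recall(y_true, y_pred, ACCEPTABLE))
```

One possible reason to structure the problem this way is that each binary model faces a simpler decision than a single three-class model trained on data in which 95% of segments belong to one class; the abstract reports the outcome of that comparison but not the mechanism, so this remains an interpretation.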

Funder

Ministry of Health and Welfare, Republic of Korea

Seoul National University Bundang Hospital Research Fund

Publisher

MDPI AG

Subject

Molecular Medicine, Biomedical Engineering, Biochemistry, Biomaterials, Bioengineering, Biotechnology

