Machine learning and statistical approaches for classification of risk of coronary artery disease using plasma cytokines

Author:

Saharan Seema Singh, Nagar Pankaj, Creasy Kate Townsend, Stock Eveline O., Feng James, Malloy Mary J., Kane John P.

Abstract

Background: According to the 2017 WHO fact sheet, Coronary Artery Disease (CAD) is the leading cause of death worldwide and accounts for 31% of total fatalities. The 17.6 million deaths caused by CAD in 2016 underscore the urgent need for proactive, accelerated, pre-emptive diagnosis. Emerging Machine Learning (ML) techniques can be leveraged for early detection of CAD, which is crucial to saving lives. Standard techniques such as angiography provide reliable evidence but are invasive and typically expensive and risky. In contrast, ML-generated diagnosis is non-invasive, fast, accurate and affordable, so ML algorithms can serve as a supplement or precursor to conventional methods. This research demonstrates the implementation and comparative analysis of the K Nearest Neighbor (k-NN) and Random Forest ML algorithms for a targeted “At Risk” CAD classification using an emerging set of 35 cytokine biomarkers, which are strongly indicative predictive variables and potential targets for therapy. To improve generalizability, mechanisms such as data balancing and repeated k-fold cross-validation for hyperparameter tuning were integrated within the models. The separability of “At Risk” CAD versus Control achieved by the models is assessed with the Area Under the Receiver Operating Characteristic curve (AUROC), a metric that summarizes the tradeoff between the true positive and false positive rates.

Results: Two classifiers were developed, both built using 35 cytokine predictive features. The best AUROC score, 0.99 with a 95% Confidence Interval (CI) of (0.982, 0.999), was achieved by the Random Forest classifier. The second-best AUROC score, 0.954 with a 95% CI of (0.929, 0.979), was achieved by the k-NN model. An independent t-test yielded a p-value of less than 7.481e-10, indicating that the Random Forest classifier was significantly better than the k-NN classifier with regard to the AUROC score.

Conclusions: As large-scale efforts gain momentum to enable early, fast, reliable, affordable, and accessible detection of individuals at risk for CAD, powerful ML algorithms can be leveraged as a supplement to conventional methods such as angiography. Early detection can be further improved by incorporating 65 novel and sensitive cytokine biomarkers. Investigation of the emerging role of cytokines in CAD can materially enhance the detection of risk and the discovery of mechanisms of disease that can lead to new therapeutic modalities.
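The abstract evaluates classifiers by AUROC and compares them with an independent t-test. The following is a minimal Python sketch of those two quantities, not the authors' implementation. It assumes that AUROC is computed as the probability that a randomly chosen “At Risk” case scores higher than a randomly chosen Control, and that the t-test is the equal-variance two-sample form applied to per-resample AUROC values; the abstract does not state which samples entered the test. The function names (`auroc`, `pooled_t_statistic`) and all numbers are hypothetical toy data, not results from the study.

```python
from math import sqrt
from statistics import mean, variance


def auroc(labels, scores):
    """AUROC as P(score of a random positive > score of a random negative).

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def pooled_t_statistic(a, b):
    """Student's independent two-sample t statistic (equal-variance form).

    Returns the statistic and its degrees of freedom; converting these to a
    p-value needs a t-distribution CDF, which is left out of this sketch.
    """
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (mean(a) - mean(b)) / sqrt(pooled_var * (1 / na + 1 / nb))
    return t, na + nb - 2


# Toy labels: 1 = "At Risk", 0 = Control (made-up values).
labels = [1, 1, 1, 0, 0, 0]
scores_a = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]  # every positive outranks every negative
scores_b = [0.9, 0.4, 0.7, 0.8, 0.3, 0.2]  # one negative outranks two positives

# Hypothetical per-resample AUROC values for two models (assumption).
rf_aurocs = [0.99, 0.98, 1.00]
knn_aurocs = [0.95, 0.94, 0.96]

if __name__ == "__main__":
    print("AUROC, scores_a:", auroc(labels, scores_a))
    print("AUROC, scores_b:", auroc(labels, scores_b))
    t, df = pooled_t_statistic(rf_aurocs, knn_aurocs)
    print("t statistic:", t, "degrees of freedom:", df)
```

An AUROC of 1.0 means the two classes are perfectly separated by the score, while 0.5 corresponds to random ranking. In practice an established library routine would normally be used for both the AUROC and the p-value.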

Funder

U.S. Public Health Service

Publisher

Springer Science and Business Media LLC

Subject

Computational Mathematics, Computational Theory and Mathematics, Computer Science Applications, Genetics, Molecular Biology, Biochemistry

References (22 articles)

1. “Cardiovascular Diseases (CVDs).” World Health Organization, World Health Organization. www.who.int/news-room/fact-sheets/detail/cardiovascular-diseases-(cvds). Accessed 1 June 2020.

2. Namara KM, et al. Cardiovascular Disease as a Leading Cause of Death: How Are Pharmacists Getting Involved? Integr Pharm Res Pract. 2019;8:1–11. https://doi.org/10.2147/iprp.s133088.

3. Hastie T, Tibshirani R, Friedman J. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. 2nd ed. Springer Series in Statistics. 2017. https://web.stanford.edu/~hastie/ElemStatLearn/printings/ESLII_print12_toc.pdf. Accessed 1 June 2020.

4. Zhang J-M, An J. Cytokines, Inflammation, and Pain. Int Anesthesiol Clin. 2007;45(2):27–37. https://doi.org/10.1097/aia.0b013e318034194e.

5. Dinarello CA. Historical Insights into Cytokines. Eur J Immunol. 2007;37(Suppl 1):S34–45. www.ncbi.nlm.nih.gov/pmc/articles/PMC3140102/.

Cited by 10 articles.
