Abstract
This study aimed to develop an effective integrated machine learning (ML) scheme to predict vascular events and bleeding in patients with nonvalvular atrial fibrillation taking dabigatran, and to identify important risk factors. It is a post-hoc analysis of the Randomized Evaluation of Long-Term Anticoagulant Therapy trial database. The scheme combined one traditional prediction method, logistic regression (LGR), with four ML techniques: naive Bayes, random forest (RF), classification and regression tree, and extreme gradient boosting (XGBoost). In predicting vascular events, the area under the receiver operating characteristic curve (AUC) was higher for RF (0.780) and XGBoost (0.717) than for LGR (0.674). In predicting bleeding, the AUC was likewise higher for RF (0.684) and XGBoost (0.618) than for LGR (0.605). Our integrated ML feature selection scheme, based on the two techniques that outperformed LGR (RF and XGBoost), identified age, history of congestive heart failure and myocardial infarction, smoking, kidney function, and body mass index as major variables for vascular events, and age, kidney function, smoking, bleeding history, concomitant use of specific drugs, and dabigatran dosage as major variables for bleeding. ML is an effective approach for analyzing complex medical data. Our results may provide preliminary direction for precision medicine.
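The two quantitative steps summarized above, comparing models by AUC and merging variable importance from two models into one ranking, can be illustrated with a minimal sketch. The abstract does not state how the integrated scheme combines the RF and XGBoost rankings, so the average-rank rule below is an assumption made for illustration, not the authors' method. The function names `auc` and `average_rank` are hypothetical, and all numbers in the usage example are toy values, not trial results.

```python
from typing import Dict, List, Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_rank(importances: Dict[str, Dict[str, float]]) -> List[str]:
    """Order features by mean rank across models (rank 1 = most important).

    ASSUMPTION: average-rank aggregation is an illustrative choice; the
    source does not specify its integration rule. Every model is assumed
    to score the same set of features.
    """
    rank_sum: Dict[str, float] = {}
    for model_scores in importances.values():
        ordered = sorted(model_scores, key=model_scores.get, reverse=True)
        for rank, feature in enumerate(ordered, start=1):
            rank_sum[feature] = rank_sum.get(feature, 0.0) + rank
    n_models = len(importances)
    # Ties in mean rank are broken alphabetically so the output is stable.
    return sorted(rank_sum, key=lambda f: (rank_sum[f] / n_models, f))


if __name__ == "__main__":
    # Toy labels and predicted risks, invented for illustration only.
    print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
    # Toy importance scores, invented for illustration only.
    toy = {
        "rf": {"age": 0.5, "smoking": 0.2, "bmi": 0.3},
        "xgboost": {"age": 0.4, "smoking": 0.35, "bmi": 0.25},
    }
    print(average_rank(toy))
```

In practice the importance scores would come from the fitted models themselves, and only models whose AUC is judged adequate would contribute to the combined ranking.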
Cited by
14 articles.