BACKGROUND
Atrial fibrillation (AF) is an independent risk factor for stroke, increasing the risk fivefold. The purpose of our study was to develop a machine learning (ML)-based AF prediction model from 3 years of medical information in our database, without electrocardiograms, to identify AF risk among elderly patients.
OBJECTIVE
To examine ML approaches applied to electronic medical records in our clinical research database to identify AF risk among elderly patients.
METHODS
We developed a predictive model using electronic medical records from the Taipei Medical University clinical research database, including diagnostic codes, medications, and laboratory data. The models classified new-onset AF within a designated 1-year interval (assessed retrospectively), and performance was judged by sensitivity, specificity, F1 score, area under the receiver operating characteristic curve (AUROC), and area under the precision-recall curve (AUPRC).
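The evaluation metrics named above can be illustrated with a minimal sketch. This is not the study's code: the labels, scores, the 0.5 decision threshold, and all function names are hypothetical, and the AUPRC is approximated here by average precision, which assumes no tied scores.

```python
from typing import Dict, Sequence


def threshold_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Sensitivity, specificity, and F1 from binary labels and binary predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
        "f1": 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0,
    }


def auroc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive outranks a random negative (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Step-wise estimate of AUPRC: mean precision at each rank holding a positive."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    hits, total = 0, 0.0
    for rank, i in enumerate(order, start=1):
        if y_true[i] == 1:
            hits += 1
            total += hits / rank
    return total / hits


# Hypothetical toy data: 1 = new-onset AF, 0 = no AF.
y_true = [1, 0, 1, 0, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.6, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = threshold_metrics(y_true, y_pred)
metrics["auroc"] = auroc(y_true, scores)
metrics["auprc"] = average_precision(y_true, scores)
print(metrics)
```

Sensitivity, specificity, and F1 depend on the chosen threshold, whereas AUROC and AUPRC summarize ranking quality across all thresholds.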
RESULTS
A total of 2138 participants with AF (1028 female [48.1%]; mean [SD] age 78.8 [6.8] years) and 8552 randomly selected matched controls without AF (4112 female [48.1%]; mean [SD] age 78.8 [6.8] years) were included in the model. The 1-year new-onset AF risk prediction model based on the random forest algorithm, using medication and diagnostic information along with specific laboratory data, attained an AUROC of 0.74 and an AUPRC of 0.89, with a specificity of 98.7%. Most ML models achieved an accuracy of approximately 0.8.
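The cohort figures reported above can be checked with simple arithmetic. The sketch below uses only the counts stated in this section; the variable names are illustrative, and the majority-class figure is a derived reference point, not a result reported by the study.

```python
# Counts taken from the Results text.
cases, controls = 2138, 8552
female_cases, female_controls = 1028, 4112

total = cases + controls
ratio = controls / cases               # controls per AF case
prevalence = cases / total             # share of AF cases in the modeling cohort
female_pct_cases = 100 * female_cases / cases
female_pct_controls = 100 * female_controls / controls

# Accuracy of a classifier that labels every participant "no AF".
majority_accuracy = controls / total

print(f"controls per case: {ratio:.1f}")
print(f"AF share of cohort: {prevalence:.1%}")
print(f"female, cases: {female_pct_cases:.1f}%  controls: {female_pct_controls:.1f}%")
print(f"majority-class accuracy: {majority_accuracy:.2f}")
```

Because the counts imply 4 controls per case, accuracy alone is less informative than threshold-independent metrics such as AUROC and AUPRC.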
CONCLUSIONS
The findings of this study suggest that our ML-based model could offer acceptable discrimination in differentiating the risk of incident AF in the following year. Targeted AF screening guided by such a model could be a clinically useful option for predicting incident AF risk in elderly patients.