BACKGROUND
The duration of atrial fibrillation (AF) is an important factor in determining the treatment strategy for persistent AF (PeAF). Although machine learning (ML) models that predict AF duration from electrocardiogram (ECG) findings have been reported, no study has examined how such models affect clinicians when they are used as clinical decision support systems.
OBJECTIVE
This study had two aims: (1) to develop ML models that predict AF duration (≥1 year or <1 year) from ECG characteristics, patient background, and echocardiographic and laboratory findings; and (2) to assess clinical implementation of the ML model through a questionnaire for cardiologists, examining how their answers on AF duration changed with and without the support of the ML model.
METHODS
A total of 272 patients with PeAF aged 20-90 years, with data obtained between January 1, 2015, and December 31, 2023, were considered. After excluding 83 who met the exclusion criteria, 189 patients were included: 145 were used as training data to build the ML model and 44 as test data to evaluate its predictive ability. Ten cardiologists (Group A) predicted whether each of the 44 simulated patients had AF lasting more than 1 year (Phase 1). They then completed the same questionnaire again after being shown the ML model's predictions, and we assessed whether their answers had changed (Phase 2). Another 10 cardiologists (Group B) completed the same two-phase test as Group A, after which they were shown the percentage of correct answers in Group A.
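The cohort accounting described above can be checked with a few lines of arithmetic. This is only a sketch for the reader; the variable names are illustrative and are not taken from the study.

```python
# Cohort flow as reported in the abstract; variable names are illustrative.
screened = 272   # PeAF patients considered
excluded = 83    # met the exclusion criteria
train_n = 145    # used to build the ML model
test_n = 44      # held out to evaluate the ML model

included = screened - excluded

# The training and test sets should together account for every included patient.
assert included == train_n + test_n

test_fraction = test_n / included
print(f"included={included}, test fraction={test_fraction:.1%}")
```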
RESULTS
On the test data, the ML model achieved 81.8% accuracy (72% sensitivity and 89% specificity). The percentage of correct answers in Group A was 63.9 ± 9.6% in Phase 1 and improved to 71.6 ± 9.3% in Phase 2 (p=0.01), but did not exceed that of the ML model. The mean percentage of answers that differed from the ML model's predictions in Phase 2 was 24.3 ± 11.8%. In Group B, the percentage of correct answers was 59.8 ± 5.3% in Phase 1 and improved to 68.2 ± 5.9% in Phase 2 (p<0.01), again without exceeding that of the ML model. The mean percentage of answers that differed from the ML model's predictions in Phase 2 was 28.2 ± 4.7%, which was not significantly different from Group A (p=0.46).
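For readers less familiar with these metrics, the sketch below shows how accuracy, sensitivity, and specificity are derived from a binary confusion matrix, taking AF duration of 1 year or more as the positive class. The function name `binary_metrics` and the counts passed to it are hypothetical: the abstract does not report the confusion matrix, so the output illustrates the formulas only and should not be read as the study's data.

```python
def binary_metrics(tp, fn, tn, fp):
    """Return accuracy, sensitivity, and specificity from confusion-matrix counts."""
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,      # all correct predictions / all cases
        "sensitivity": tp / (tp + fn),      # correctly flagged positives / all positives
        "specificity": tn / (tn + fp),      # correctly flagged negatives / all negatives
    }

# HYPOTHETICAL counts for a 44-case test set (not reported in the abstract).
metrics = binary_metrics(tp=13, fn=5, tn=23, fp=3)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```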
CONCLUSIONS
The ML model predicted the duration of AF more accurately than the cardiologists did. Support from the ML model improved the cardiologists' diagnostic accuracy, but not beyond that of the ML model itself. When their own judgment conflicted with the results of the ML model, cardiologists were more likely to trust their own decision.
CLINICALTRIAL