A comparative evaluation of machine learning ensemble approaches for disease prediction using multiple datasets

Published: 2024-03-27
Volume: 14
Issue: 3
Pages: 597-613
ISSN: 2190-7188
Container-title: Health and Technology
Language: en
Short-container-title: Health Technol.

Author:

Mahajan Palak, Uddin Shahadat, Hajati Farshid, Moni Mohammad Ali, Gide Ergun

Abstract

Purpose: Machine learning models are used to develop and improve disease prediction systems. Ensemble learning is a machine learning technique that combines multiple classifiers to make more accurate predictions than a single classifier. Although several researchers have employed ensemble techniques for disease prediction, a comprehensive comparative study of these techniques has been lacking.

Methods: Using 16 disease datasets from Kaggle and the UCI Machine Learning Repository, this study compares the performance of 15 variants of ensemble techniques for disease prediction. The comparison uses six performance measures: accuracy, precision, recall, F1 score, AUC (area under the receiver operating characteristic curve) and AUPRC (area under the precision-recall curve).

Results: Multi-level stacking, a stacking variant, showed the best disease prediction performance, ahead of all bagging and boosting variants. Another stacking variant, classical stacking, ranked second. Overall, stacking outperformed bagging and boosting for disease prediction. Logit Boost showed the worst performance.

Conclusion: The findings of this study can help researchers select an appropriate ensemble approach for future studies focusing on accurate disease prediction.
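The kind of comparison described in the abstract can be illustrated with a minimal sketch. This is not the authors' code or experimental setup: the dataset (scikit-learn's bundled breast cancer data), the three ensemble configurations, the hyperparameters, the single train/test split, and the helper names `evaluate` and `compare_ensembles` are all assumptions chosen for illustration. It fits one bagging, one boosting, and one classical (single-level) stacking model and reports the six measures named in the abstract.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import (
    BaggingClassifier,
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (
    accuracy_score,
    average_precision_score,
    f1_score,
    precision_score,
    recall_score,
    roc_auc_score,
)
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier


def evaluate(model, X_train, X_test, y_train, y_test):
    """Fit one model and return the six measures on the held-out split."""
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    # Probability of the positive class (label 1), needed for AUC and AUPRC.
    y_score = model.predict_proba(X_test)[:, 1]
    return {
        "accuracy": accuracy_score(y_test, y_pred),
        "precision": precision_score(y_test, y_pred),
        "recall": recall_score(y_test, y_pred),
        "f1": f1_score(y_test, y_pred),
        "auc": roc_auc_score(y_test, y_score),
        # Average precision is a common estimate of the area under the PR curve.
        "auprc": average_precision_score(y_test, y_score),
    }


def compare_ensembles(random_state=0):
    """Compare three illustrative ensembles on an assumed example dataset."""
    X, y = load_breast_cancer(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=random_state
    )
    # Assumed configurations; the study itself compares 15 variants.
    models = {
        "bagging": BaggingClassifier(
            DecisionTreeClassifier(random_state=random_state),
            n_estimators=25,
            random_state=random_state,
        ),
        "boosting": GradientBoostingClassifier(
            n_estimators=50, random_state=random_state
        ),
        "stacking": StackingClassifier(
            estimators=[
                ("tree", DecisionTreeClassifier(max_depth=5, random_state=random_state)),
                ("forest", RandomForestClassifier(n_estimators=25, random_state=random_state)),
            ],
            final_estimator=LogisticRegression(max_iter=1000),
            cv=5,
        ),
    }
    return {
        name: evaluate(model, X_train, X_test, y_train, y_test)
        for name, model in models.items()
    }


if __name__ == "__main__":
    for name, scores in compare_ensembles().items():
        print(name, {metric: round(value, 3) for metric, value in scores.items()})
```

A single split on one dataset says little about which family is better in general. A comparison in the spirit of the study would repeat this over many datasets and resampling rounds before ranking the methods.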

Funder

University of Sydney

Publisher

Springer Science and Business Media LLC

Link

https://link.springer.com/content/pdf/10.1007/s12553-024-00835-w.pdf
