Enhancing Cardiovascular Disease Prediction: A Domain Knowledge-Based Feature Selection and Stacked Ensemble Machine Learning Approach

Published: 2023-06-26

Author:

Rustamov Zahiriddin¹, Rustamov Jaloliddin¹, Zaki Nazar¹, Turaev Sherzod¹, Sultana Most Sarmin², Tan Jeanne Ywei², Balakrishnan Vimala²

Affiliation:

1. United Arab Emirates University

2. University Malaya

Abstract

Cardiovascular diseases (CVDs) are prevalent disorders affecting the heart or blood vessels. Early detection significantly improves survival prospects, which makes accurate prediction methods essential. Emerging technologies such as machine learning (ML) offer promising avenues for more precise prediction of CVDs. A critical challenge, however, lies in developing models that not only deliver strong predictive performance but also conform to well-established domain knowledge, thereby enhancing their credibility. Single classifiers often fall short because of issues such as overfitting and bias. In response, this study proposes domain knowledge-based feature selection integrated with a stacking ensemble classifier. The Framingham Heart Study, UCI Heart Disease, and UAE retrospective cohort study datasets were used to train and evaluate the ML algorithms. The results indicate that the proposed domain knowledge-based feature selection performs on par with frequently adopted feature selection techniques. Moreover, the proposed stacked ensemble, in conjunction with domain knowledge-based feature selection, achieved the highest metrics on the Framingham dataset, with 89.66% accuracy and an 89.16% F1-score. Similarly, the proposed method achieved F1-scores of 85.26% and 96.23% on the UCI Heart Disease and UAE datasets, respectively. Furthermore, this study employs explainable AI techniques to illuminate the decision-making process of the predictive models. The study thus establishes that domain knowledge-based feature selection promotes the credibility of ML models without compromising predictive performance.
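
The abstract does not list the selected features, the base classifiers, or the meta-learner, so the following is only a minimal sketch of the general idea: restrict the inputs to a hand-curated feature list, train base learners on those features, and fit a meta-learner on their out-of-fold predictions. Everything named here is an assumption made for illustration: `DOMAIN_FEATURES`, the one-feature threshold "stumps" used as base learners, the vote-pattern lookup table used as the meta-learner, and the eight synthetic patient records.

```python
from collections import Counter, defaultdict
from statistics import mean

# Hypothetical hand-curated risk factors; NOT the feature list used in the study.
DOMAIN_FEATURES = ["age", "systolic_bp", "cholesterol"]

def fit_stump(rows, labels, feature):
    """Base learner: threshold one feature at the midpoint of the two class means."""
    pos = mean(r[feature] for r, y in zip(rows, labels) if y == 1)
    neg = mean(r[feature] for r, y in zip(rows, labels) if y == 0)
    cut, flip = (pos + neg) / 2, pos < neg
    return lambda r: int((r[feature] > cut) != flip)

def fit_stacked(rows, labels, features, folds=2):
    """Fit one stump per feature, then a meta-learner on out-of-fold votes."""
    meta_counts = defaultdict(Counter)  # vote pattern -> label counts
    for k in range(folds):
        train = [i for i in range(len(rows)) if i % folds != k]
        held = [i for i in range(len(rows)) if i % folds == k]
        stumps = [
            fit_stump([rows[i] for i in train], [labels[i] for i in train], f)
            for f in features
        ]
        # Base learners only vote on rows they were not trained on.
        for i in held:
            votes = tuple(s(rows[i]) for s in stumps)
            meta_counts[votes][labels[i]] += 1

    # Refit the base learners on all rows for use at prediction time.
    final = [fit_stump(rows, labels, f) for f in features]

    def predict(row):
        votes = tuple(s(row) for s in final)
        if votes in meta_counts:
            return meta_counts[votes].most_common(1)[0][0]
        # Unseen vote pattern: fall back to a plain majority vote.
        return int(sum(votes) * 2 > len(votes))

    return predict

# Synthetic records for illustration only; "patient_id" is present in the data
# but ignored because it is not in DOMAIN_FEATURES.
rows = [
    {"patient_id": 1, "age": 40, "systolic_bp": 118, "cholesterol": 180},
    {"patient_id": 2, "age": 45, "systolic_bp": 122, "cholesterol": 190},
    {"patient_id": 3, "age": 62, "systolic_bp": 150, "cholesterol": 240},
    {"patient_id": 4, "age": 58, "systolic_bp": 145, "cholesterol": 230},
    {"patient_id": 5, "age": 38, "systolic_bp": 115, "cholesterol": 200},
    {"patient_id": 6, "age": 50, "systolic_bp": 125, "cholesterol": 185},
    {"patient_id": 7, "age": 66, "systolic_bp": 160, "cholesterol": 250},
    {"patient_id": 8, "age": 70, "systolic_bp": 155, "cholesterol": 260},
]
labels = [0, 0, 1, 1, 0, 0, 1, 1]  # 1 = CVD, 0 = no CVD (synthetic)

predict = fit_stacked(rows, labels, DOMAIN_FEATURES)
print(predict({"age": 68, "systolic_bp": 158, "cholesterol": 245}))
print(predict({"age": 35, "systolic_bp": 110, "cholesterol": 170}))
```

Training the meta-learner on out-of-fold votes rather than in-sample votes is what distinguishes stacking from simply voting. In practice a library implementation such as scikit-learn's `StackingClassifier` would likely replace the hand-rolled stumps and lookup table shown here.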

Publisher

Research Square Platform LLC
