Abstract
Heart disease is a major global health concern in which the heart cannot pump blood adequately, producing symptoms such as weakness, difficulty breathing, and swollen feet. Early detection is crucial and typically relies on factors such as age, gender, and pulse rate, together with electrocardiogram (ECG) screening for irregular heartbeats. Risk factors include obesity, smoking, diabetes, high blood pressure, and an unhealthy diet; diabetic individuals face elevated risk because of accelerated atherosclerosis and high blood sugar levels. Managing heart disease involves lifestyle modification, medication adherence, and regular medical check-ups. Healthcare systems use data mining, machine learning, and clinical decision support systems to analyze large databases and predict conditions such as heart disease, employing both supervised and unsupervised learning. Big data applications in healthcare, which incorporate genomic data and electronic health records, provide insight into treatment effectiveness and enable real-time analysis of patient data, facilitating personalized medicine and potentially saving lives. This paper assesses the attributes found in diabetes patients’ data in order to predict heart disease accurately. The most important features for heart disease prediction are identified using Correlation-based Feature Subset Selection with Best First Search. The selected attributes are age, gender, diastolic blood pressure, diabetes, smoking, obesity, diet, physical activity, stress, chest pain type, history of chest pain, troponin, and ECG, together with the target (class) attribute. Several machine learning methods are applied and compared for heart disease prediction: logistic regression, K-nearest neighbor (K-NN), decision trees, random forests, and multilayer perceptrons (MLPs).
Among these, K-NN trained on the selected feature subset achieves the highest accuracy (80%), exceeding the accuracy obtained with all input features.
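The feature selection step named above can be illustrated with a minimal sketch. It scores each candidate subset with the standard correlation-based merit, k·r_cf / sqrt(k + k(k−1)·r_ff), where r_cf is the mean feature-to-class correlation and r_ff is the mean feature-to-feature correlation, and explores subsets with a forward best-first search. Several things here are assumptions rather than details from the paper: Pearson correlation is used as the correlation measure, the stopping rule of five non-improving expansions (`max_stale`) is illustrative, and the eight-row toy dataset is made up (only the column names `troponin` and `smoker` echo attributes listed in the abstract).

```python
import heapq
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def cfs_merit(subset, columns, target):
    """CFS merit of a feature subset: k*r_cf / sqrt(k + k*(k-1)*r_ff)."""
    k = len(subset)
    if k == 0:
        return 0.0
    # Mean absolute feature-to-class correlation.
    r_cf = sum(abs(pearson(columns[f], target)) for f in subset) / k
    # Mean absolute feature-to-feature correlation (0 for a single feature).
    pairs = list(combinations(subset, 2))
    r_ff = 0.0
    if pairs:
        r_ff = sum(abs(pearson(columns[a], columns[b])) for a, b in pairs) / len(pairs)
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)


def best_first_cfs(columns, target, max_stale=5):
    """Forward best-first search over subsets, scored by cfs_merit.

    Stops after max_stale consecutive expansions that fail to improve
    the best merit seen so far (an assumed stopping rule).
    """
    names = sorted(columns)
    open_heap = [(0.0, [])]          # entries are (-merit, sorted feature names)
    visited = {frozenset()}
    best_subset, best_merit = frozenset(), 0.0
    stale = 0
    while open_heap and stale < max_stale:
        _, current_names = heapq.heappop(open_heap)   # most promising subset
        current = frozenset(current_names)
        improved = False
        for name in names:
            child = current | {name}
            if name in current or child in visited:
                continue
            visited.add(child)
            merit = cfs_merit(sorted(child), columns, target)
            heapq.heappush(open_heap, (-merit, sorted(child)))
            if merit > best_merit + 1e-12:
                best_subset, best_merit = child, merit
                improved = True
        stale = 0 if improved else stale + 1
    return sorted(best_subset), best_merit


# Made-up toy data: 8 hypothetical patients, binary attributes.
target = [0, 0, 0, 0, 1, 1, 1, 1]            # 1 = heart disease
columns = {
    "troponin": [0, 0, 0, 1, 1, 1, 1, 1],    # strongly related to the target
    "smoker":   [0, 0, 1, 0, 1, 1, 0, 1],    # moderately related
    "noise":    [1, 0, 1, 0, 1, 0, 1, 0],    # unrelated to the target
}

selected, merit = best_first_cfs(columns, target)
print(selected, round(merit, 3))
```

On this toy data the search keeps `smoker` and `troponin` and drops `noise`, because adding an attribute that is uncorrelated with the class lowers the merit. In a full pipeline, the selected columns would then be passed to each classifier (for example K-NN) and the resulting accuracy compared against training on all attributes.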