Author:
Ouf Shimaa, ElSeddawy Ahmed I. B.
Abstract
Systems based on data mining techniques could have a crucial impact on employees' lifestyles by predicting heart disease. Many scientific papers use data mining techniques to predict heart disease. However, few have addressed the four cross-validation techniques for splitting the dataset, which play an important role in selecting the best technique for predicting heart disease. It is important to choose the optimal combination of cross-validation technique and data mining classification technique to enhance the performance of the prediction models. This paper applies four cross-validation techniques (holdout, k-fold cross-validation, stratified k-fold cross-validation, and repeated random) with eight data mining classification techniques (Linear Discriminant Analysis, Logistic Regression, Support Vector Machine, KNN, Decision Tree, Naïve Bayes, Random Forest, and Neural Network) to improve the accuracy of heart disease prediction and select the best prediction models. It analyzes these techniques on a small and a large dataset collected from different data sources, such as Kaggle and the UCI Machine Learning Repository. Accuracy, precision, recall, and F-measure were used to measure the performance of the prediction models. Experiments on the two datasets show that when the dataset is large (70,000 records), the optimal combination is holdout cross-validation with the neural network, with an accuracy of 71.82%. For the small dataset (303 records), the optimal combination is repeated random with Random Forest, with an accuracy of 89.01%. The best models will be recommended to physicians in business organizations to help them classify employees into one of two categories, cardiac and non-cardiac, at an early stage. Early detection of heart disease in employees will improve productivity in the business organization.
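The four splitting schemes named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the 30% test fraction, the fold count, and the seeds below are all assumptions chosen for illustration, and only the Python standard library is used. Each function returns index lists rather than fitting any classifier.

```python
import random


def holdout_split(n, test_fraction=0.3, seed=0):
    """Holdout: one shuffled train/test split of the indices 0..n-1."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_test = int(round(n * test_fraction))
    return idx[n_test:], idx[:n_test]


def _folds_to_splits(folds):
    """Turn k disjoint folds into k (train, test) pairs."""
    splits = []
    for i, test in enumerate(folds):
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        splits.append((train, list(test)))
    return splits


def k_fold_splits(n, k=10, seed=0):
    """k-fold: every index appears in exactly one test fold."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    return _folds_to_splits(folds)


def stratified_k_fold_splits(labels, k=10, seed=0):
    """Stratified k-fold: each fold keeps roughly the overall class ratio."""
    rng = random.Random(seed)
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)
    folds = [[] for _ in range(k)]
    offset = 0  # continue dealing where the previous class stopped
    for members in by_class.values():
        rng.shuffle(members)
        for pos, i in enumerate(members):
            folds[(offset + pos) % k].append(i)
        offset += len(members)
    return _folds_to_splits(folds)


def repeated_random_splits(n, repeats=10, test_fraction=0.3, seed=0):
    """Repeated random: several independent holdout splits, one per seed."""
    return [holdout_split(n, test_fraction, seed + r) for r in range(repeats)]
```

In use, a model would be trained on each `train` list and scored on the matching `test` list, with the scores averaged over the splits for every scheme except single holdout. Unlike k-fold, the repeated random scheme allows the same record to land in the test set of more than one split.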
Publisher
Southwest Jiaotong University
Cited by
15 articles.
1. Heart Disease Prediction Using Weighted K-Nearest Neighbor Algorithm;Operations Research Forum;2024-08-17
2. Automated Smart Prediction of Heart Disease Using Data Mining;Green Industrial Applications of Artificial Intelligence and Internet of Things;2024-07-21
3. Optimization heart disease prediction using independent component analysis and support vector machine;International Journal of Current Innovations in Advanced Research;2024-04-11
4. Heart Disease Detection Using AI;International Journal of Innovative Science and Research Technology (IJISRT);2024-03-12
5. Smart Artificial Intelligence System for Heart Disease Prediction;International Journal of Engineering and Advanced Technology;2024-02-28