K-means-SMOTE for handling class imbalance in the classification of diabetes with C4.5, SVM, and naive Bayes

Published: 2020-02-14 Issue: 2 Volume: 8 Page: 89-93
ISSN: 2338-0403
Container-title: Jurnal Teknologi dan Sistem Komputer

Author:

Hairani Hairani¹, Saputro Khurniawan Eko¹, Fadli Sofiansyah²

Affiliation:

1. Universitas Bumigora

2. Sekolah Tinggi Manajemen Informatika dan Komputer Lombok

Abstract

Class imbalance in a dataset biases classification results toward the class with the most data (the majority class). A sampling method is needed to balance the minority class (the positive class) so that the class distribution becomes balanced, leading to better classification results. This study addresses the class imbalance problem on the Pima Indians diabetes dataset using k-means-SMOTE. The dataset has 268 instances of the positive class (minority class) and 500 instances of the negative class (majority class). Classification was performed by comparing C4.5, SVM, and naive Bayes, with k-means-SMOTE applied for data sampling. Using k-means-SMOTE, SVM achieved the highest accuracy and sensitivity, 82% and 77% respectively, while naive Bayes produced the highest specificity, 89%.
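
The quantities named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes balancing means a 1:1 class ratio, it shows only the SMOTE interpolation step (k-means-SMOTE additionally clusters the data with k-means and oversamples inside selected clusters, which is omitted here), and the function names, the two-feature sample points, and the confusion-matrix counts are hypothetical values chosen for illustration, not results from the paper.

```python
import random


def synthetic_needed(n_minority, n_majority):
    """Number of synthetic minority samples required for a 1:1 class ratio."""
    return max(n_majority - n_minority, 0)


def smote_point(sample, neighbor, rng):
    """Interpolate one synthetic point on the segment between two minority samples."""
    gap = rng.random()  # uniform in [0, 1)
    return [s + gap * (n - s) for s, n in zip(sample, neighbor)]


def evaluate(tp, fn, tn, fp):
    """Accuracy, sensitivity (recall on positives), specificity (recall on negatives)."""
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Class counts taken from the abstract: 268 positive, 500 negative.
needed = synthetic_needed(268, 500)

# Hypothetical two-feature minority samples, only to show the interpolation.
rng = random.Random(0)
point = smote_point([1.0, 2.0], [3.0, 6.0], rng)

# Hypothetical confusion-matrix counts, NOT the paper's results.
metrics = evaluate(tp=40, fn=10, tn=45, fp=5)

print(needed)
print(point)
print(metrics)
```

Under the 1:1 assumption, the 268 positive instances would need 232 synthetic companions to match the 500 negative instances. Sensitivity and specificity are reported separately because accuracy alone can look high on imbalanced data even when the minority class is poorly recognized.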

Funder

Universitas Bumigora

Publisher

Institute of Research and Community Services Diponegoro University (LPPM UNDIP)

Subject

General Earth and Planetary Sciences, General Environmental Science

Reference: 11 articles.

1. Synthetic Over Sampling Methods for Handling Class Imbalanced Problems : A Review

2. SMOTE: Synthetic Minority Over-sampling Technique

3. Performance evaluation of class balancing techniques for credit card fraud detection

4. SVM classification: Optimization with the SMOTE algorithm for the class imbalance problem

5. Perbandingan Algoritme Machine Learning untuk Memprediksi Pengambil Matakuliah

Cited by 12 articles.

1. Comparative Analysis of Algorithms Naïve Bayes and C45 for Student Satisfaction with Administrative Services. 2023 International Conference of Computer Science and Information Technology (ICOSNIKOM). 2023-11-10.

2. Social Media Sentiment Analysis for Local Water Company Customers Using a Support Vector Machine Algorithm. 2023 10th International Conference on ICT for Smart Society (ICISS). 2023-09-06.

3. AOH-Senti: Aspect-Oriented Hybrid Approach to Sentiment Analysis of Students’ Feedback. SN Computer Science. 2023-01-11.

4. Prediksi Siswa Putus Sekolah Swasta Menggunakan Algoritma Bayesian Network (Studi Pada : SMA Islam Al Wahid Kepung). Jurnal Teknologi dan Sistem Komputer. 2022-04-30.

5. Synthetic minority over-sampling technique nominal continous logistic regression for imbalanced data. AIP Conference Proceedings. 2022.