Abstract
Classifying imbalanced data is a challenging problem in data mining and machine learning. Models trained on imbalanced input perform poorly on the minority class. Resampling methods address this by balancing the skewed dataset. Cluster-based Undersampling (CUS) and Near-Miss (NM) are widely used in imbalanced learning, but both have serious flaws: CUS ignores the distance factor among majority-class instances, and NM discards the inter-class data within the majority class. To overcome these flaws, this study proposes an undersampling technique called Adaptive K-means Clustering Undersampling (AKCUS), which combines the distance factor with clustering over the majority class. The proposed method was evaluated in an experimental study: three multiminority datasets with different imbalance ratios were selected, and models were built using K-Nearest Neighbor (kNN), Decision Tree (DT), and Random Forest (RF) classifiers. The experimental results show that AKCUS can achieve better performance than the benchmark methods on multiminority datasets with high imbalance ratios.
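The general idea of combining clustering with a distance factor when undersampling the majority class can be illustrated with a minimal Python sketch. This is not the authors' AKCUS algorithm, whose details the abstract does not give. It is a hypothetical illustration that assumes plain k-means (not adaptive k-means) over the majority class, a per-cluster quota proportional to cluster size, and a Near-Miss-style preference for majority points closest to the minority class. The function names `kmeans` and `cluster_distance_undersample` and the parameters `k` and `n_keep` are invented for this example.

```python
import math
import random


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means (an assumption; the paper uses an adaptive variant).

    Returns one cluster index per point.
    """
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: attach each point to its nearest center.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points
        ]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels


def cluster_distance_undersample(majority, minority, k, n_keep, seed=0):
    """Keep roughly n_keep majority points (hypothetical scheme, not AKCUS)."""
    labels = kmeans(majority, k, seed=seed)
    kept = []
    for c in range(k):
        members = [p for p, l in zip(majority, labels) if l == c]
        # Clustering part: quota proportional to cluster size, so every
        # region of the majority class stays represented.
        quota = round(n_keep * len(members) / len(majority))
        # Distance part: prefer majority points nearest to any minority point.
        members.sort(key=lambda p: min(math.dist(p, q) for q in minority))
        kept.extend(members[:quota])
    return kept


# Synthetic two-dimensional data, generated only for this illustration.
rng = random.Random(42)
majority = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(200)]
minority = [(rng.gauss(3, 0.5), rng.gauss(3, 0.5)) for _ in range(20)]

balanced_majority = cluster_distance_undersample(majority, minority, k=4, n_keep=20)
print(len(balanced_majority), "majority samples kept")
```

Because each cluster's quota is rounded independently, the number of retained samples may differ slightly from `n_keep`. With several minority classes, as in the multiminority datasets the study uses, the distance term would need to be defined per minority class; the sketch handles only a single minority set.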
Publisher
Engineering, Technology & Applied Science Research