Abstract
The oversampling ratio N and the number of nearest minority-class neighbors k are key hyperparameters of the synthetic minority oversampling technique (SMOTE), which uses them to reconstruct the class distribution of a dataset. Neither has an optimal default value. It is therefore necessary to examine how the output dataset affects classification performance when SMOTE adopts different hyperparameter combinations. In this paper, we propose a hyperparameter optimization algorithm for imbalanced data. It iteratively searches for reasonable values of N and k for SMOTE so as to build a balanced, high-quality dataset. A model trained on that dataset achieves strong performance and generalization ability, and thus addresses imbalanced classification effectively. The proposed algorithm hybridizes the simulated annealing (SA) mechanism with the particle swarm optimization (PSO) algorithm. During optimization, Cohen’s Kappa is used to construct the fitness function, and a new classifier, AdaRBFNN, is built by integrating multiple trained RBF neural networks with the AdaBoost algorithm. The Kappa of each generation is calculated from the classification results to evaluate the quality of each candidate solution. Experiments are conducted on seven groups of KEEL datasets. The results show that the proposed algorithm performs well and significantly improves the classification accuracy of the minority class.
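To illustrate why Cohen's Kappa is a reasonable basis for a fitness function on imbalanced data, the following minimal sketch computes Kappa from true and predicted labels using the standard definition (p_o - p_e) / (1 - p_e). It is not the paper's implementation: the function name `cohen_kappa`, the toy 90/10 label split, and the convention of returning 0.0 when chance agreement equals 1 are assumptions made for this example only.

```python
from collections import Counter


def cohen_kappa(y_true, y_pred):
    """Cohen's Kappa: (p_o - p_e) / (1 - p_e).

    p_o is the observed agreement (plain accuracy); p_e is the agreement
    expected by chance given the marginal label frequencies.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    n = len(y_true)
    p_o = sum(t == p for t, p in zip(y_true, y_pred)) / n
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Counter returns 0 for labels that never appear in the predictions.
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if p_e == 1.0:
        # Degenerate case (a single class everywhere); returning 0.0 is an
        # assumption of this sketch, not something the abstract specifies.
        return 0.0
    return (p_o - p_e) / (1.0 - p_e)


# Toy imbalanced labels: 90 majority (0) and 10 minority (1) samples.
y_true = [0] * 90 + [1] * 10

# A classifier that always predicts the majority class: 90% accuracy, Kappa 0.
majority_only = [0] * 100

# A classifier that recovers half of the minority samples.
half_minority = [0] * 90 + [1] * 5 + [0] * 5

kappa_majority = cohen_kappa(y_true, majority_only)
kappa_half = cohen_kappa(y_true, half_minority)
kappa_perfect = cohen_kappa(y_true, y_true)
```

In this toy case the majority-only predictor scores 90% accuracy but a Kappa of zero, so a fitness function built on Kappa would not reward a candidate (N, k) pair whose resampled dataset leads the classifier to ignore the minority class.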
Subject
Artificial Intelligence, Computer Vision and Pattern Recognition, Theoretical Computer Science
Cited by
2 articles.