Affiliation:
1. School of Computer Science and Engineering, North Minzu University, Yinchuan, China
Abstract
Data generated in many real-world fields, such as fraud detection, network intrusion detection and disease diagnosis, are often imbalanced. The class with fewer instances is called the minority class, and in some applications the minority class contains the significant information. Many classification methods and strategies have been proposed for binary imbalanced data, but multi-class imbalanced data still pose many problems and challenges that urgently need to be solved. This paper analyzes and summarizes classification methods for multi-class imbalanced data in two groups, data preprocessing methods and algorithm-level classification methods, and compares the performance of the algorithms in each group on the same dataset. For data preprocessing, it mainly introduces oversampling, under-sampling, hybrid sampling and feature selection. Algorithm-level classification methods are covered from four aspects: ensemble learning, neural networks, support vector machines and multi-class decomposition techniques. Each data preprocessing method and algorithm-level classification method is also analyzed in detail in terms of the techniques used, the algorithms it is compared against, and its pros and cons. In addition, the evaluation metrics commonly used for multi-class imbalanced data classification are described comprehensively. Finally, future directions for multi-class imbalanced data classification are given.
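As a rough illustration of the oversampling idea named in the abstract, the sketch below duplicates minority-class instances at random until every class is as large as the largest one. It is only the simplest baseline of that family, not a method taken from the paper; the function name `random_oversample` and the toy three-class data are assumptions made up for this example.

```python
import random
from collections import Counter, defaultdict


def random_oversample(samples, labels, seed=0):
    """Randomly duplicate instances of smaller classes until every class
    has as many instances as the largest class (hypothetical helper)."""
    rng = random.Random(seed)

    # Group the instances by class label.
    by_class = defaultdict(list)
    for x, y in zip(samples, labels):
        by_class[y].append(x)

    # Every class is grown to the size of the largest class.
    target = max(len(xs) for xs in by_class.values())

    out_x, out_y = [], []
    for y, xs in by_class.items():
        # Draw the missing instances with replacement from the same class.
        extra = [rng.choice(xs) for _ in range(target - len(xs))]
        for x in xs + extra:
            out_x.append(x)
            out_y.append(y)
    return out_x, out_y


# Toy three-class data (invented for illustration): 6 / 3 / 1 instances.
X = [[i] for i in range(10)]
y = ["a"] * 6 + ["b"] * 3 + ["c"] * 1

X_res, y_res = random_oversample(X, y)
print(Counter(y_res))  # class counts after resampling
```

Plain duplication adds no new information and can encourage overfitting to the repeated minority instances, which is one reason the literature also considers under-sampling, hybrid sampling and algorithm-level approaches.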
Subject
Artificial Intelligence, General Engineering, Statistics and Probability
Cited by
2 articles.