Abstract
To address the uneven distribution and indistinct class boundaries found in imbalanced datasets, an imbalanced-dataset classification algorithm based on a neighbor density support vector machine (NDSVM) is proposed. The algorithm first calculates the neighbor range density of each sample in the majority class. According to the density values, it selects the majority-class samples that lie on or close to the class border, equal in number to the minority-class samples, and trains an initial SVM classifier on these selected samples together with the minority class. The resulting support vectors and the remaining majority-class data are then used to optimize the initial classifier. Experiments on artificial and UCI datasets show that NDSVM achieves better classification performance than WSVM, ALSMOTE-SVM and SVM, and that it effectively improves the performance of the SVM algorithm on imbalanced datasets with uneven distribution and indistinct boundaries.
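The selection step described above can be illustrated with a minimal sketch. The abstract does not give the density formula or the selection rule, so everything here is an assumption made for illustration: density is taken to be the number of other majority-class samples within a fixed radius, and the samples with the lowest density are treated as border candidates. The names `neighbor_density`, `select_border_candidates` and `radius` are hypothetical and do not come from the paper.

```python
import math


def neighbor_density(points, radius):
    """Count, for each point, how many other points lie within `radius`.

    Assumption: this stands in for the paper's "neighbor range density",
    whose exact definition is not given in the abstract.
    """
    densities = []
    for i, p in enumerate(points):
        count = sum(
            1
            for j, q in enumerate(points)
            if i != j and math.dist(p, q) <= radius
        )
        densities.append(count)
    return densities


def select_border_candidates(majority, n_minority, radius):
    """Split majority samples into (selected, residual).

    Assumption: low-density samples are treated as lying on or near the
    class border. Exactly `n_minority` samples are selected so that the
    initial training set is balanced.
    """
    densities = neighbor_density(majority, radius)
    # Stable sort: ties keep their original order.
    order = sorted(range(len(majority)), key=lambda i: densities[i])
    chosen = set(order[:n_minority])
    selected = [majority[i] for i in sorted(chosen)]
    residual = [majority[i] for i in range(len(majority)) if i not in chosen]
    return selected, residual


# Toy data: a dense majority cluster plus two sparser majority points.
majority = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (1.0, 1.0), (1.2, 0.9)]
minority = [(2.0, 2.0), (2.1, 1.9)]

selected, residual = select_border_candidates(majority, len(minority), radius=0.5)
print(selected)  # majority samples for the initial, balanced training set
print(residual)  # majority samples held back for the refinement stage
```

In this sketch the `selected` samples, together with the minority class, would form the balanced set for the initial SVM, while `residual` would be held back for the later optimization stage. The SVM training itself is omitted because the abstract does not specify its details.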
Subject
General Physics and Astronomy