Affiliation:
1. College of Science, University of Shanghai for Science and Technology, Shanghai, China
2. Department of Control Science and Engineering, University of Shanghai for Science and Technology, Shanghai, China
Abstract
Self-training semi-supervised classification has become a popular research topic. However, when real-world data contain outliers, imbalanced classes, or missing values, traditional self-training semi-supervised methods can suffer reduced classification accuracy. In this paper, we develop a two-step robust semi-supervised self-training classification algorithm that works with imbalanced and incomplete data. The proposed method differs from traditional self-training semi-supervised methods in three major ways: (1) It does not require the balanced and complete data assumptions of traditional semi-supervised self-training methods, since it completes and rebalances the dataset simultaneously. (2) It is compatible with many classifiers, so it can handle multi-class and non-linear classification. (3) The classifier is resistant to outliers during semi-supervised classification. Furthermore, we perform several numerical simulations to illustrate the quality of our method on synthesized data, as well as multiple experiments that demonstrate its superior classification performance on various real datasets.
Subject
Artificial Intelligence, General Engineering, Statistics and Probability