Affiliation:
1. School of Computer Science and Communication Engineering, Jiangsu University, Zhenjiang, China
Abstract
Class imbalance learning (CIL) has become one of the most challenging research topics in machine learning. In this article, we propose a Boosted co-training method that modifies the class distribution so that traditional classifiers can be readily adapted to imbalanced datasets. This article is among the first to use the pseudo-labelled data produced by co-training to enlarge the training set of the minority class. Compared with existing oversampling methods, which generate minority samples from labelled data alone, the proposed method can learn from unlabelled data and thus reduces the risk of overfitting. Furthermore, we propose a boosting-style technique that implicitly modifies the class distribution, and we combine it with co-training to alleviate the bias towards majority classes. Finally, we collect the two series of classifiers generated during Boosted co-training into an ensemble for classification, which further improves CIL performance by leveraging the strength of ensemble learning. By exploiting the diversity of co-training, we also contribute a new approach to generating base classifiers for ensemble learning. The proposed method is compared with eight state-of-the-art CIL methods on 18 benchmark datasets. Measured by G-Mean, F-Measure, and AUC, Boosted co-training achieves the best performance and average ranks, and the experimental results demonstrate its significant superiority over the other CIL methods.
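The abstract sketches three ingredients: co-training over two feature views whose confident minority pseudo-labels enlarge the other view's training set, a boosting-style reweighting that up-weights misclassified minority samples, and a final ensemble over all classifiers collected along the way. The toy sketch below illustrates that loop under our own assumptions (synthetic two-Gaussian data, decision trees as base learners, a fixed confidence threshold of 0.9, and a simple doubling of misclassified minority weights); it is not the paper's exact algorithm, only a minimal illustration of the structure it describes.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

def make_data(n_maj, n_min):
    """Two Gaussian blobs in 4-D; class 1 is the (possibly rare) minority."""
    X = np.vstack([rng.normal(0.0, 1.0, size=(n_maj, 4)),
                   rng.normal(2.0, 1.0, size=(n_min, 4))])
    y = np.array([0] * n_maj + [1] * n_min)
    return X, y

X_lab, y_lab = make_data(200, 20)   # labelled, imbalanced (10:1)
X_unl, _ = make_data(200, 200)      # unlabelled pool (labels unused)

views = [slice(0, 2), slice(2, 4)]  # two feature views for co-training
ensemble = []                       # all classifiers collected across rounds

# Per-view training sets and boosting-style sample weights.
train_X = [X_lab.copy(), X_lab.copy()]
train_y = [y_lab.copy(), y_lab.copy()]
train_w = [np.ones(len(y_lab)), np.ones(len(y_lab))]

for t in range(3):                  # a few co-training rounds
    for i, view in enumerate(views):
        clf = DecisionTreeClassifier(max_depth=3, random_state=t)
        clf.fit(train_X[i][:, view], train_y[i], sample_weight=train_w[i])
        ensemble.append((view, clf))

        # Pseudo-label the pool; keep confident minority predictions and
        # feed them to the OTHER view's training set (co-training step).
        proba = clf.predict_proba(X_unl[:, view])[:, 1]
        picked = np.where(proba > 0.9)[0][:10]
        j = 1 - i
        train_X[j] = np.vstack([train_X[j], X_unl[picked]])
        train_y[j] = np.concatenate([train_y[j], np.ones(len(picked), int)])
        train_w[j] = np.concatenate([train_w[j], np.ones(len(picked))])

        # Boosting-style reweighting: up-weight misclassified minority samples,
        # implicitly shifting the class distribution towards the minority.
        pred = clf.predict(train_X[i][:, view])
        miss = (pred != train_y[i]) & (train_y[i] == 1)
        train_w[i][miss] *= 2.0

def predict(X):
    """Final ensemble: average predicted minority probabilities."""
    p = np.mean([c.predict_proba(X[:, v])[:, 1] for v, c in ensemble], axis=0)
    return (p > 0.5).astype(int)

X_te, y_te = make_data(100, 100)    # balanced test set
acc = (predict(X_te) == y_te).mean()
```

On this well-separated toy data the ensemble classifies the balanced test set accurately; the point of the sketch is the control flow (two views exchanging pseudo-labels, weights drifting towards the minority class, every intermediate classifier retained for the final vote), not the numbers.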
Funder
National Natural Science Foundation of China
Subject
Artificial Intelligence,Computational Theory and Mathematics,Theoretical Computer Science,Control and Systems Engineering
Cited by
2 articles.