An Investigation of Imbalanced Ensemble Learning Methods for Cross-Project Defect Prediction

Published:2019-11 Issue:12 Volume:33 Page:1959037
ISSN:0218-0014
Container-title:International Journal of Pattern Recognition and Artificial Intelligence
language:en
Short-container-title:Int. J. Patt. Recogn. Artif. Intell.

Author:

Qiu Shaojian¹, Lu Lu¹, Jiang Siyu², Guo Yang¹

Affiliation:

1. School of Computer Science and Engineering, South China University of Technology, Guangzhou 510000, P. R. China

2. School of Software Engineering, South China University of Technology, Guangzhou 510000, P. R. China

Abstract

Machine-learning-based software defect prediction (SDP) methods are receiving great attention from researchers in intelligent software engineering. Most existing SDP methods operate in a within-project setting. However, a new SDP task usually has little or no within-project training data from which to learn a usable supervised prediction model. Therefore, cross-project defect prediction (CPDP), which uses labeled data from source projects to learn a defect predictor for a target project, was proposed as a practical SDP solution. In real CPDP tasks, the class imbalance problem is ubiquitous and has a great impact on the performance of CPDP models. Unlike previous studies that focus on subsampling and individual methods, this study investigated 15 imbalanced learning methods for CPDP tasks, with particular attention to the effectiveness of imbalanced ensemble learning (IEL) methods. We evaluated the 15 methods through extensive experiments on 31 open-source projects derived from five datasets. By analyzing a total of 37504 results, we found that in most cases the IEL method that combines under-sampling and bagging is more effective than the other investigated methods.
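
To make the idea of combining under-sampling with bagging concrete, the following is a minimal sketch and not the authors' implementation. It assumes binary labels (1 = defective, 0 = clean), a toy nearest-centroid base learner, and hypothetical function names (`balanced_bag`, `fit_under_bagging`, `predict_under_bagging`); the abstract does not specify any of these. Each ensemble member is trained on a bootstrap sample in which the majority class is cut down to the size of the minority class, and the members vote on each target-project module.

```python
import random

def balanced_bag(X, y, rng):
    """Draw one bag: bootstrap the minority class, then draw an equally
    sized random sample (with replacement) from the majority class."""
    pos = [i for i, label in enumerate(y) if label == 1]
    neg = [i for i, label in enumerate(y) if label == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    k = len(minority)
    keep = rng.choices(minority, k=k) + rng.choices(majority, k=k)
    return [X[i] for i in keep], [y[i] for i in keep]

def fit_centroids(X, y):
    """Toy base learner (an assumption): one mean feature vector per class."""
    centroids = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids

def predict_centroids(centroids, x):
    """Return the label whose centroid is closest to x (squared distance)."""
    def dist2(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda label: dist2(centroids[label]))

def fit_under_bagging(X, y, n_estimators=11, seed=0):
    """Train n_estimators base learners, each on its own balanced bag."""
    rng = random.Random(seed)
    return [fit_centroids(*balanced_bag(X, y, rng)) for _ in range(n_estimators)]

def predict_under_bagging(models, x):
    """Majority vote over the ensemble; ties fall to the clean class (0)."""
    votes = sum(predict_centroids(m, x) for m in models)
    return 1 if votes * 2 > len(models) else 0

# Made-up source-project data: 8 clean modules, 2 defective ones (imbalanced).
X_source = [[1, 1], [2, 1], [1, 2], [2, 2], [1, 3], [3, 1], [2, 3], [3, 2],
            [9, 9], [8, 9]]
y_source = [0] * 8 + [1] * 2

models = fit_under_bagging(X_source, y_source, n_estimators=5, seed=1)
# Predict labels for two modules of a hypothetical target project.
print(predict_under_bagging(models, [9, 8]), predict_under_bagging(models, [1, 1]))
```

In a cross-project setting the training rows would come from the source projects and the predictions would be made on the target project; the two feature columns here are placeholders for real software metrics.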

Funder

National Nature Science Foundation of China

Guangdong Province Application Major Fund

Guangzhou Produce & Research Fund

Zhongshan Produce & Research Fund

Publisher

World Scientific Pub Co Pte Lt

Subject

Artificial Intelligence, Computer Vision and Pattern Recognition, Software

Link

https://www.worldscientific.com/doi/pdf/10.1142/S0218001419590377

Reference: 40 articles.

1. Assessing the accuracy of prediction algorithms for classification: an overview

2. New Applications of Ensembles of Classifiers

3. Bagging predictors

4. SMOTE: Synthetic Minority Over-sampling Technique

5. SMOTEBoost: Improving Prediction of the Minority Class in Boosting

Cited by 25 articles.

1. Identification of Software Bugs by Analyzing Natural Language-Based Requirements Using Optimized Deep Learning Features;Computers, Materials & Continua;2024

2. Hyperparameter Optimization for Software Bug Prediction Using Ensemble Learning;IEEE Access;2024

3. DP-CCL: A Supervised Contrastive Learning Approach Using CodeBERT Model in Software Defect Prediction;IEEE Access;2024

4. Software Defect Prediction Using Deep Semantic Feature Learning;2023 International Conference on Evolutionary Algorithms and Soft Computing Techniques (EASCT);2023-10-20

5. Adversarial domain adaptation for cross-project defect prediction;Empirical Software Engineering;2023-09