Class Imbalance Reduction (CIR): A Novel Approach to Software Defect Prediction in the Presence of Class Imbalance

Published:2020-03-04 Issue:3 Volume:12 Page:407
ISSN:2073-8994
Container-title:Symmetry
language:en
Short-container-title:Symmetry

Author:

Bejjanki Kiran Kumar, Gyani Jayadev, Gugulothu Narsimha

Abstract

Software defect prediction (SDP) is a technique used to predict the occurrence of defects in the early stages of the software development process. Early prediction of defects reduces the overall cost of software and increases its reliability. Most of the defect prediction methods proposed in the literature suffer from the class imbalance problem. In this paper, a novel class imbalance reduction (CIR) algorithm is proposed to create symmetry between the defect and non-defect records in imbalanced datasets by considering the distribution properties of the datasets. It is compared with SMOTE (synthetic minority oversampling technique), a built-in package of many machine learning tools that is considered a benchmark in handling class imbalance problems, and with K-Means SMOTE. We conducted the experiment on forty open source software defect datasets from the PRedictOr Models In Software Engineering (PROMISE) repository using eight different classifiers and evaluated the results with six performance measures. The results show that the proposed CIR method achieves improved performance over SMOTE and K-Means SMOTE.
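For readers unfamiliar with the SMOTE baseline named in the abstract, the following is a minimal sketch of its core idea: a synthetic minority record is placed at a random point on the segment between a minority sample and one of its nearest minority neighbours. It is not the paper's CIR algorithm, which the abstract does not specify. The function names `smote_sketch` and `balance_with_smote`, the parameter defaults, and the toy data are assumptions made for illustration only.

```python
import math
import random


def smote_sketch(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points built by SMOTE-style interpolation.

    Illustrative only: each new point lies on the segment between a
    randomly chosen minority sample and one of its k nearest minority
    neighbours (Euclidean distance).
    """
    if n_new <= 0:
        return []
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by distance to the base sample.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        neighbour = minority[rng.choice(others[:k])]
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def balance_with_smote(defective, clean, k=5, seed=0):
    """Oversample the defective (minority) class up to the size of the clean class."""
    n_new = max(0, len(clean) - len(defective))
    return list(defective) + smote_sketch(defective, n_new, k=k, seed=seed)


# Hypothetical toy data: 3 defective modules vs. 8 clean ones, 2 metrics each.
defective = [[1.0, 2.0], [2.0, 3.0], [3.0, 1.0]]
clean = [[10.0, 10.0] for _ in range(8)]
balanced_defective = balance_with_smote(defective, clean)
```

After the call, `balanced_defective` holds as many records as `clean`, and every synthetic record stays inside the range spanned by the original defective samples. In practice a maintained implementation such as the one in the imbalanced-learn package would normally be used instead of a hand-written sketch.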

Publisher

MDPI AG

Subject

Physics and Astronomy (miscellaneous), General Mathematics, Chemistry (miscellaneous), Computer Science (miscellaneous)

Link

https://www.mdpi.com/2073-8994/12/3/407/pdf

References: 40 articles.

1. Exploratory Undersampling for Class-Imbalance Learning

2. SMOTE: Synthetic Minority Over-sampling Technique

3. SMOTE for Learning from Imbalanced Data: Progress and Challenges, Marking the 15-year Anniversary

4. Smote-variants: A python implementation of 85 minority oversampling techniques

Cited by 33 articles.

1. Rice phenology monitoring via ensemble classification for an extremely imbalanced multiclass dataset of hybrid remote sensing;Remote Sensing Applications: Society and Environment;2024-08

2. An Embedded Machine Learning-Based Spoiled Leftover Food Detection Device for Multiclass Classification;Journal of Information and Communication Technology;2024-04-30

3. Enhancing Software Defect Prediction accuracy using Modified Entropy Calculation in Random Forest Algorithm;Journal of Electrical Systems;2024-03-28

4. A Software Defect Prediction Approach based on Machine Learning;2024 IEEE 7th Advanced Information Technology, Electronic and Automation Control Conference (IAEAC);2024-03-15

5. AI-empowered mobile edge computing: inducing balanced federated learning strategy over edge for balanced data and optimized computation cost;Journal of Cloud Computing;2024-03-04