IAPCP: An Effective Cross‐Project Defect Prediction Model via Intra‐Domain Alignment and Programming‐Based Distribution Adaptation

Published: 2024-01 Issue: 1 Volume: 2024
ISSN: 1751-8806
Container-title: IET Software
Language: en

Author:

Zhang Nana, Zhu Kun, Zhu Dandan

Abstract

Cross‐project defect prediction (CPDP) aims to identify defect‐prone software instances in one project (the target) using historical data collected from other software projects (the source), which helps maintainers allocate limited testing resources reasonably. Unfortunately, the discrepancy between the feature distributions of the source and target projects makes it difficult to transfer a matching feature representation and severely hinders CPDP performance. In addition, existing CPDP models require an expensive and time‐consuming process to tune many parameters. To address these limitations, this study proposes IAPCP, an effective CPDP model based on distribution adaptation that consists of two stages: correlation alignment and intra‐domain programming. Correlation alignment first calculates the covariance matrices of the source and target projects, then removes the feature correlations of the source project (i.e., a whitening operation) and re‐colors the source features with the covariance of the target project, thereby aligning the source and target feature distributions and reducing the distribution discrepancy across projects. Intra‐domain programming directly learns a nonparametric linear transfer defect predictor with strong discriminative capacity by solving for a probabilistic annotation matrix (PAM) based on the adjusted features of the source project. The model requires neither model selection nor parameter tuning. Extensive experiments on a total of 82 cross‐project pairs from 16 software projects demonstrate that IAPCP achieves competitive CPDP effectiveness and efficiency compared with multiple state‐of‐the‐art baseline models.

Funder

Fundamental Research Funds for the Central Universities

National Natural Science Foundation of China

Key Laboratory of Embedded System and Service Computing Ministry of Education

Publisher

Institution of Engineering and Technology (IET)
