Affiliation:
1. School of Computer Science and Communication Engineering, Jiangsu University, Zhenjiang, China
2. Jiangsu Key Laboratory of Security Technology for Industrial Cyberspace, Jiangsu University, Zhenjiang, China
3. Faculty of Computing and Information Systems, Ghana Communication Technology University, Accra, Ghana
Abstract
Although cross‐project defect prediction (CPDP) techniques that build defect prediction models from traditional hand‐crafted features are well developed, they usually ignore the semantic and structural information inside the program and fail to capture the hidden features that are critical for program category prediction, which leads to poor defect prediction results. Researchers have proposed using deep learning to automatically extract the semantic features of programs and fuse them with traditional features as training data. In practice, however, two questions remain open: how to represent the semantic features of programs effectively, and in what ratio the two types of features should be fused to maximize the effectiveness of the model. In this paper, we propose a semantic feature enhancement‐based defect prediction framework (SFE‐DP), which augments the semantic feature set extracted from the program code with additional data. We also introduce a self‐attention layer and a matching layer into the model structure to filter out low‐efficiency and non‐critical semantic features. Finally, we adopt a hybrid loss function to iteratively optimize the model parameters. Extensive experiments show that SFE‐DP outperforms the baseline approaches on 90 pairs of CPDP tasks formed from 10 open‐source projects.
Funder
National Natural Science Foundation of China
National Key Research and Development Program of China
Natural Science Foundation of Jiangsu Province
Qinglan Project of Jiangsu Province of China
Graduate Research and Innovation Projects of Jiangsu Province