Affiliation:
1. College of Computer Science and Electronic Engineering, Hunan University, Changsha, P. R. China
2. College of Computer, National University of Defense Technology, Changsha, P. R. China
Abstract
Engineers use software defect prediction (SDP) to locate vulnerable areas of software. Recently, statement-level SDP has attracted the attention of researchers due to its ability to localize faulty code areas. This paper proposes DP-Tramo, a new model dedicated to improving the state-of-the-art statement-level SDP. We use Clang to extract abstract syntax trees from source code and extract 32 statement-level metrics as static features for each sentence. Then we feed static features and token sequences as inputs to our improved R-Transformer to learn the syntactic and semantic features of the code. Furthermore, we use label smoothing and weighted loss to improve the performance of DP-Tramo. To evaluate DP-Tramo, we perform a 10-fold cross-validation on 119,989 C/C++ programs selected from Code4Bench. Experimental results show that DP-Tramo can classify the dataset with an average performance of 0.949, 0.602, 0.734 and 0.737 regarding the recall, precision, accuracy and F1-measure, respectively. DP-Tramo outperforms the baseline method on F1-measure by 1.2% while maintaining a high recall rate.
Funder
National Natural Science Foundation of China
Publisher
World Scientific Pub Co Pte Ltd
Subject
Electrical and Electronic Engineering,Hardware and Architecture,Media Technology