Affiliation:
1. Sun Yat-sen University, Guangzhou, China
2. Sun Yat-sen University, Zhuhai, China
Abstract
Just-in-time defect prediction (JIT-DP) predicts the defect-proneness of a commit, and just-in-time defect localization (JIT-DL) locates the exact buggy positions (defective lines) in a commit. Recently, various JIT-DP and JIT-DL techniques have been proposed; however, most of them locate defective lines in a post-mortem way (e.g., via code entropy, attention weights, or LIME) based on the prediction results of JIT-DP. These methods do not utilize the label information of the defective code lines during model building. In this paper, we propose a unified model, JIT-Smart, which turns the training of the just-in-time defect prediction and localization tasks into a mutually reinforcing multi-task learning process. Specifically, we design a novel defect localization network (DLN), which explicitly introduces the label information of defective code lines for supervised learning in JIT-DL while accounting for the class imbalance issue. To further investigate the accuracy and cost-effectiveness of JIT-Smart, we compare it with 7 state-of-the-art baselines under 5 commit-level and 5 line-level evaluation metrics for JIT-DP and JIT-DL. The results demonstrate that JIT-Smart is statistically better than all the state-of-the-art baselines in both JIT-DP and JIT-DL. In JIT-DP, at the median value, JIT-Smart achieves an F1-Score of 0.475, AUC of 0.886, Recall@20%Effort of 0.823, Effort@20%Recall of 0.01 and Popt of 0.942, improving over the baselines by 19.89%-702.74%, 1.23%-31.34%, 9.44%-33.16%, 21.6%-53.82% and 1.94%-34.89%, respectively. In JIT-DL, at the median value, JIT-Smart achieves a Top-5 Accuracy of 0.539, Top-10 Accuracy of 0.396, Recall@20%Effort_line of 0.726, Effort@20%Recall_line of 0.087 and IFA_line of 0.098, improving over the baselines by 101.83%-178.35%, 101.01%-277.31%, 257.88%-404.63%, 71.91%-74.31% and 99.11%-99.41%, respectively. Statistical analysis shows that JIT-Smart also performs more stably than the best-performing baseline. In addition, JIT-Smart achieves the best performance among the state-of-the-art baselines in cross-project evaluation.
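To make the multi-task idea concrete, the sketch below shows one plausible way to combine a commit-level (JIT-DP) loss with a line-level (JIT-DL) loss that up-weights the rare defective-line class. It is a minimal illustration in PyTorch; the function name, tensor shapes, the specific weighting scheme, and the default values of pos_weight and alpha are assumptions for exposition, not the authors' published implementation.

```python
# Minimal sketch of a joint JIT-DP + JIT-DL training objective (illustrative only).
import torch
import torch.nn as nn


def joint_jit_loss(commit_logits, commit_labels, line_logits, line_labels,
                   pos_weight=10.0, alpha=0.5):
    """Combine a commit-level loss with a class-imbalance-aware line-level loss.

    commit_logits: (batch,) raw defect-proneness scores for whole commits.
    line_logits:   (batch, max_lines) raw scores for individual changed lines.
    pos_weight:    up-weights the rare defective-line class (hypothetical value).
    alpha:         balances the two tasks in the multi-task objective (hypothetical value).
    """
    # Commit-level binary cross-entropy (JIT-DP task).
    dp_loss = nn.functional.binary_cross_entropy_with_logits(
        commit_logits, commit_labels)
    # Line-level binary cross-entropy with a positive-class weight (JIT-DL task).
    dl_loss = nn.functional.binary_cross_entropy_with_logits(
        line_logits, line_labels,
        pos_weight=torch.tensor(pos_weight))
    # The two losses reinforce each other through shared model parameters.
    return alpha * dp_loss + (1.0 - alpha) * dl_loss


# Tiny usage example with random tensors standing in for model outputs.
commit_logits = torch.randn(4)
commit_labels = torch.randint(0, 2, (4,)).float()
line_logits = torch.randn(4, 16)
line_labels = torch.randint(0, 2, (4, 16)).float()
print(joint_jit_loss(commit_logits, commit_labels, line_logits, line_labels))
```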
Funder
National Key R&D Program of China
Natural Science Foundation of Guangdong Province
Publisher
Association for Computing Machinery (ACM)