Graph-based machine learning improves just-in-time defect prediction

Published:2023-04-13 Issue:4 Volume:18 Page:e0284077
ISSN:1932-6203
Container-title:PLOS ONE
language:en
Short-container-title:PLoS ONE

Author:

Bryan Jonathan, Moriano Pablo

Abstract

The increasing complexity of today’s software requires the contribution of thousands of developers. This complex collaboration structure makes developers more likely to introduce defect-prone changes that lead to software faults. Determining when these defect-prone changes are introduced has proven challenging, and using traditional machine learning (ML) methods to make these determinations seems to have reached a plateau. In this work, we build contribution graphs consisting of developers and source files to capture the nuanced complexity of changes required to build software. By leveraging these contribution graphs, our research shows the potential of using graph-based ML to improve Just-In-Time (JIT) defect prediction. We hypothesize that features extracted from the contribution graphs may be better predictors of defect-prone changes than intrinsic features derived from software characteristics. We corroborate our hypothesis using graph-based ML for classifying edges that represent defect-prone changes. This new framing of the JIT defect prediction problem leads to remarkably better results. We test our approach on 14 open-source projects and show that our best model can predict whether or not a code change will lead to a defect with an F1 score as high as 77.55% and a Matthews correlation coefficient (MCC) as high as 53.16%. This represents a 152% higher F1 score and a 3% higher MCC over the state-of-the-art JIT defect prediction. We describe limitations, open challenges, and how this method can be used for operational JIT defect prediction.
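The edge-classification framing in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy change list, the defect labels, the two degree-based edge features, and the threshold rule standing in for a trained classifier are all hypothetical and serve only to show how a developer-file contribution graph yields per-edge features and how F1 and MCC are computed from the resulting predictions.

```python
from collections import defaultdict
from math import sqrt


def build_contribution_graph(changes):
    """Build a bipartite developer-file graph from (developer, file) pairs."""
    dev_to_files = defaultdict(set)
    file_to_devs = defaultdict(set)
    for dev, path in changes:
        dev_to_files[dev].add(path)
        file_to_devs[path].add(dev)
    return dev_to_files, file_to_devs


def edge_features(edge, dev_to_files, file_to_devs):
    """Return (developer degree, file degree) for one edge (illustrative features)."""
    dev, path = edge
    return (len(dev_to_files[dev]), len(file_to_devs[path]))


def confusion_counts(y_true, y_pred):
    """Count true positives, true negatives, false positives, false negatives."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    tn = sum(1 for t, p in zip(y_true, y_pred) if not t and not p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    return tp, tn, fp, fn


def f1_score(y_true, y_pred):
    """F1 = 2TP / (2TP + FP + FN); defined as 0.0 when the denominator is zero."""
    tp, _, fp, fn = confusion_counts(y_true, y_pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc(y_true, y_pred):
    """Matthews correlation coefficient; defined as 0.0 when the denominator is zero."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical toy data: each change is one edge of the contribution graph.
changes = [
    ("alice", "a.py"),
    ("alice", "b.py"),
    ("bob", "b.py"),
    ("carol", "b.py"),
    ("carol", "c.py"),
]
labels = [0, 1, 1, 0, 0]  # hypothetical: 1 marks a defect-prone change

dev_to_files, file_to_devs = build_contribution_graph(changes)
features = [edge_features(e, dev_to_files, file_to_devs) for e in changes]

# Stand-in for a trained classifier: flag edges whose file has >= 2 developers.
predictions = [1 if file_degree >= 2 else 0 for _, file_degree in features]

f1 = f1_score(labels, predictions)
m = mcc(labels, predictions)
print(f"F1 = {f1:.2f}, MCC = {m:.2f}")
```

In practice the threshold rule would be replaced by a classifier trained on richer graph-derived features; the metric functions stay the same regardless of how the predictions are produced.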

Funder

UT-Battelle, LLC

Oak Ridge National Laboratory’s (ORNL’s) Laboratory Directed Research and Development

DOE, Office of Science, Office of Workforce Development for Teachers and Scientists (WDTS) under the Scientific Undergraduate Laboratory Internship (SULI) program

ORNL’s Artificial Intelligence initiative

Publisher

Public Library of Science (PLoS)

Subject

Multidisciplinary

References: 76 articles.

1. Bacchelli A, Bird C. Expectations, outcomes, and challenges of modern code review. In: 35th International Conference on Software Engineering (ICSE); 2013. p. 712–721.

2. Telang R. An empirical analysis of the impact of software vulnerability announcements on firm stock price. IEEE Trans Softw Eng. 2007.

3. Hassan AE. Predicting faults using the complexity of code changes. In: 31st IEEE International Conference on Software Engineering (ICSE); 2009. p. 78–88.

4. Kamei Y. A large-scale empirical study of just-in-time quality assurance. IEEE Trans Softw Eng. 2012.

5. Hall T. A systematic literature review on fault prediction performance in software engineering. IEEE Trans Softw Eng. 2011.

Cited by 4 articles.

1. SGT: Aging-related bug prediction via semantic feature learning based on graph-transformer;Journal of Systems and Software;2024-11

2. JITGNN: A deep graph neural network framework for Just-In-Time bug prediction;Journal of Systems and Software;2024-01

3. Machine Learning and Deep Learning Techniques to Predict Software Defects: A Bibliometric Analysis, Systematic Review, Challenges and Future Works;2024

4. When Things Changed;Perspectives on Artificial Intelligence in Times of Turbulence;2023-11-24