Constructing meaningful code changes via graph transformer

Published:2023-01-21 Issue:2 Volume:17 Page:154-167
ISSN:1751-8806
Container-title:IET Software
language:en
Short-container-title:IET Software

Author:

Guo Shikai¹², Li Mengxuan¹, Ge Xin¹, Li Hui¹, Chen Rong¹, Li Tingting³

Affiliation:

1. The College of Information Science and Technology, Dalian Maritime University, Dalian, China

2. The Key Laboratory for Artificial Intelligence of Dalian, Dalian, China

3. Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, China

Abstract

The rapid development of Open-Source Software (OSS) has resulted in a significant demand for code changes to maintain OSS. Symptoms of poor design and implementation choices often appear in code changes, making it considerably harder for code reviewers to verify their correctness and soundness. Researchers have investigated how to learn meaningful code changes to help developers anticipate the changes that code reviewers may suggest for submitted code. However, two main limitations remain: the difficulty of capturing long-range dependencies in the source code, and the absence of syntactic structural information about the source code. To address these limitations, a novel method is proposed, named Graph Transformer for learning meaningful Code Transformations (GTCT), which gives developers quick preliminary feedback when they submit code changes and can thereby improve both the quality of code changes and the efficiency of code review. GTCT comprises two components: code graph embedding and code transformation learning. To address the missing syntactic structural information, the code graph embedding component captures the types and patterns of code changes by encoding the source code into a code graph structure built from its lexical and syntactic representations. The code transformation learning component then uses multi-head attention and positional encoding to address the long-range dependencies limitation. Extensive experiments evaluate GTCT through both quantitative and qualitative analyses. In the quantitative analysis, GTCT outperforms the baseline on six datasets by relative margins of 210%, 342.86%, 135%, 29.41%, 109.09%, and 91.67% in terms of perfect predictions. The qualitative analysis shows that GTCT outperforms the baseline method on each type of code change in the taxonomy: bug fixes, code refactoring, and other changes.
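
The relative margins quoted in the abstract can be read as ratios between GTCT's perfect-prediction rate and the baseline's. The following is a minimal sketch of that arithmetic, assuming the usual definition of relative improvement, (model - baseline) / baseline; the abstract does not give the underlying rates, so the helper names and the sample rates 10 and 31 are hypothetical and chosen only for illustration.

```python
def relative_improvement(baseline: float, model: float) -> float:
    """Percentage gain of `model` over `baseline`, relative to `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (model - baseline) / baseline * 100.0


def implied_ratio(relative_gain_percent: float) -> float:
    """Ratio model/baseline implied by a relative gain given in percent."""
    return 1.0 + relative_gain_percent / 100.0


# Relative gains as reported in the abstract, in the order listed there.
reported_gains = [210.0, 342.86, 135.0, 29.41, 109.09, 91.67]

# How many times the baseline's perfect-prediction rate each gain implies.
ratios = [round(implied_ratio(g), 4) for g in reported_gains]

# Hypothetical rates (not from the paper): a baseline at 10 and a model at 31
# would correspond to a 210% relative improvement.
example_gain = relative_improvement(10.0, 31.0)

if __name__ == "__main__":
    for gain, ratio in zip(reported_gains, ratios):
        print(f"{gain:7.2f}% relative gain -> {ratio} x baseline")
```

Under this reading, a 210% relative gain means roughly 3.1 times as many perfect predictions as the baseline, while 29.41% means roughly 1.29 times as many.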

Funder

National Natural Science Foundation of China

Natural Science Foundation of Liaoning Province

Publisher

Institution of Engineering and Technology (IET)

Subject

Computer Graphics and Computer-Aided Design

Link

https://onlinelibrary.wiley.com/doi/pdf/10.1049/sfw2.12097
