Constructing meaningful code changes via graph transformer

Author:

Guo Shikai (1, 2), Li Mengxuan (1), Ge Xin (1), Li Hui (1), Chen Rong (1), Li Tingting (3)

Affiliation:

1. The College of Information Science and Technology, Dalian Maritime University, Dalian, China

2. The Key Laboratory for Artificial Intelligence of Dalian, Dalian, China

3. Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, China

Abstract

The rapid development of open-source software (OSS) has created significant demand for code changes to maintain it. Code changes often show symptoms of poor design and implementation choices, which makes it much harder for code reviewers to verify their correctness and soundness. Researchers have investigated how to learn meaningful code changes so that developers can anticipate the changes that code reviewers may suggest for submitted code. However, two main limitations remain: difficulty with long-range dependencies in the source code, and missing syntactic structural information of the source code. To address these limitations, a novel method is proposed, named Graph Transformer for learning meaningful Code Transformations (GTCT), which gives developers preliminary and quick feedback when they submit code changes, improving both the quality of code changes and the efficiency of code review. GTCT comprises two components: code graph embedding and code transformation learning. To address the missing syntactic structural information, the code graph embedding component captures the types and patterns of code changes by encoding the source code into a code graph structure built from its lexical and syntactic representations. The code transformation learning component then uses multi-head attention and positional encoding to address the long-range dependencies. Extensive experiments evaluate GTCT through both quantitative and qualitative analyses. In the quantitative analysis, GTCT outperforms the baseline on six datasets, with relative improvements of 210%, 342.86%, 135%, 29.41%, 109.09%, and 91.67% in perfect prediction. The qualitative analysis shows that GTCT outperforms the baseline method for each type of code change in the taxonomy: bug fixes, refactoring, and other changes.
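
The two mechanisms the abstract credits with handling long-range dependencies, positional encoding and multi-head attention, can be illustrated with a minimal pure-Python sketch. This is not the GTCT implementation: the sinusoidal encoding scheme, the function names, and the omission of learned projection matrices are all assumptions made for illustration, since the abstract does not specify these details.

```python
import math


def positional_encoding(seq_len, d_model):
    """Sinusoidal position vectors (an assumed scheme; the abstract names none)."""
    pe = [[0.0] * d_model for _ in range(seq_len)]
    for pos in range(seq_len):
        for i in range(0, d_model, 2):
            angle = pos / (10000 ** (i / d_model))
            pe[pos][i] = math.sin(angle)
            if i + 1 < d_model:
                pe[pos][i + 1] = math.cos(angle)
    return pe


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention: every position can weight every other one."""
    scale = math.sqrt(len(keys[0]))
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out


def multi_head_self_attention(x, num_heads):
    """Split each vector into num_heads slices, attend per slice, concatenate.

    Hypothetical simplification: learned query/key/value/output projections
    are omitted, so each head attends over a raw slice of the input.
    """
    d_model = len(x[0])
    assert d_model % num_heads == 0, "d_model must be divisible by num_heads"
    d_head = d_model // num_heads
    heads = []
    for h in range(num_heads):
        part = [row[h * d_head:(h + 1) * d_head] for row in x]
        heads.append(attention(part, part, part))
    return [sum((heads[h][t] for h in range(num_heads)), [])
            for t in range(len(x))]


# Usage: add position information to (hypothetical) token embeddings, then attend.
tokens = [[0.1 * (t + j) for j in range(8)] for t in range(4)]
pe = positional_encoding(4, 8)
encoded = [[a + b for a, b in zip(tok, p)] for tok, p in zip(tokens, pe)]
contextual = multi_head_self_attention(encoded, num_heads=2)
```

Because the attention weights connect every pair of positions directly, the distance between two tokens does not limit how strongly one can influence the other, and the positional encoding restores the order information that attention alone would ignore.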

Funder

National Natural Science Foundation of China

Natural Science Foundation of Liaoning Province

Publisher

Institution of Engineering and Technology (IET)

Subject

Computer Graphics and Computer-Aided Design

