Affiliation:
1. School of Computer Science and Technology, Soochow University, Suzhou, P. R. China
Abstract
Code summarization is the task of automatically producing natural-language descriptions of source code. Recently, many deep-learning-based approaches have been proposed to generate accurate code summaries, among which pre-trained models (PTMs) for programming languages have achieved promising results. It is well known that source code written in programming languages is highly structured and unambiguous. Although previous work pre-trained models with well-designed tasks to learn universal representations from large-scale data, it has not considered structure information during the fine-tuning stage. To make full use of both the pre-trained programming language model and the structure information of source code, we use the Flow-Augmented Abstract Syntax Tree (FA-AST) of source code as the source of structure information and propose GraphPLBART (Graph-augmented Programming Language and Bi-directional Auto-Regressive Transformer), which effectively introduces structure information into a well-trained PTM through a cross-attention layer. Compared with the best-performing baselines, GraphPLBART improves by 3.2%, 7.1%, and 1.2% in terms of BLEU, METEOR, and ROUGE-L, respectively, on the Java dataset, and by 4.0%, 6.3%, and 2.1% on the Python dataset. Further experiments show that the structure information from FA-AST significantly benefits the performance of GraphPLBART. In addition, a manual evaluation further supports the advantage of our approach, indicating the high quality of its generated summaries and making it a promising solution for code summarization.
Funder
National Natural Science Foundation of China
Natural Science Foundation of Jiangsu Higher Education Institutions of China
Undergraduate Training Program for Innovation and Entrepreneurship, Soochow University
Publisher
World Scientific Pub Co Pte Ltd
Subject
Artificial Intelligence, Computer Graphics and Computer-Aided Design, Computer Networks and Communications, Software