Affiliation:
1. Université de Montréal, Montréal, Canada
Abstract
Code review is a fundamental process in software development that plays a pivotal role in ensuring code quality and reducing the likelihood of errors and bugs. However, code review can be complex, subjective, and time-consuming. Quality estimation, comment generation, and code refinement constitute the three key tasks of this process, and their automation has traditionally been addressed separately in the literature using different approaches. In particular, recent efforts have focused on fine-tuning pre-trained language models to aid in code review tasks, with each task being considered in isolation. We believe that these tasks are interconnected, and their fine-tuning should consider this interconnection. In this paper, we introduce a novel deep-learning architecture, named DISCOREV, which employs cross-task knowledge distillation to address these tasks simultaneously. In our approach, we utilize a cascade of models to enhance both comment generation and code refinement models. The fine-tuning of the comment generation model is guided by the code refinement model, while the fine-tuning of the code refinement model is guided by the quality estimation model. We implement this guidance using two strategies: a feedback-based learning objective and an embedding alignment objective. We evaluate DISCOREV by comparing it to state-of-the-art methods based on independent training and fine-tuning. Our results show that our approach generates better review comments, as measured by the BLEU score, as well as more accurate code refinement according to the CodeBLEU score.
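The abstract does not give the exact form of the embedding alignment objective. As a rough illustration only, the sketch below assumes one common choice: the task loss of the model being fine-tuned plus a weighted cosine-distance term that pulls its embedding toward the embedding produced by the guiding model. The function names, the cosine-distance form, and the weight `alpha` are all assumptions made for this example, not details taken from the paper.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors (lists of floats)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)


def alignment_loss(guided_emb, guiding_emb):
    """Hypothetical alignment term: 0 when the embeddings point the same way."""
    return 1.0 - cosine_similarity(guided_emb, guiding_emb)


def guided_loss(task_loss, guided_emb, guiding_emb, alpha=0.5):
    """Assumed combined objective: task loss plus weighted alignment term.

    `alpha` is an illustrative hyperparameter, not a value from the paper.
    """
    return task_loss + alpha * alignment_loss(guided_emb, guiding_emb)


if __name__ == "__main__":
    # Toy vectors standing in for the comment generation model's embedding
    # and the code refinement model's embedding.
    comment_emb = [0.2, 0.9, 0.1]
    refinement_emb = [0.1, 0.8, 0.3]
    print(guided_loss(task_loss=1.7, guided_emb=comment_emb,
                      guiding_emb=refinement_emb))
```

In this reading, the same pattern would apply one level down the cascade, with the code refinement model as the guided model and the quality estimation model as the guide.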
Publisher
Association for Computing Machinery (ACM)