An Accurate Identifier Renaming Prediction and Suggestion Approach

Author:

Zhang Jingxuan1ORCID,Luo Junpeng1ORCID,Liang Jiahui1ORCID,Gong Lina1ORCID,Huang Zhiqiu1ORCID

Affiliation:

1. Nanjing University of Aeronautics and Astronautics, China

Abstract

Identifiers play an important role in helping developers analyze and comprehend source code. However, many identifiers exist that are inconsistent with the corresponding code conventions or semantic functions, leading to flawed identifiers. Hence, identifiers need to be renamed regularly. Even though researchers have proposed several approaches to identify identifiers that need renaming and further suggest correct identifiers for them, these approaches only focus on a single or a limited number of granularities of identifiers without universally considering all the granularities and suggest a series of sub-tokens for composing identifiers without completely generating new identifiers. In this article, we propose a novel identifier renaming prediction and suggestion approach. Specifically, given a set of training source code, we first extract all the identifiers in multiple granularities. Then, we design and extract five groups of features from identifiers to capture inherent properties of identifiers themselves and the relationships between identifiers and code conventions, as well as other related code entities, enclosing files, and change history. By parsing the change history of identifiers, we can figure out whether specific identifiers have been renamed or not. These identifier features and their renaming history are used to train a Random Forest classifier, which can be further used to predict whether a given new identifier needs to be renamed or not. Subsequently, for the identifiers that need renaming, we extract all the related code entities and their renaming change history. Based on the intuition that identifiers are co-evolved as their relevant code entities with similar patterns and renaming sequences, we could suggest and recommend a series of new identifiers for those identifiers. We conduct extensive experiments to validate our approach in both the Java projects and the Android projects. Experimental results demonstrate that our approach could identify identifiers that need renaming with an average F-measure of more than 89%, which outperforms the state-of-the-art approach by 8.30% in the Java projects and 21.38% in the Android projects. In addition, our approach achieves a Hit@10 of 48.58% and 40.97% in the Java and Android projects in suggesting correct identifiers and outperforms the state-of-the-art approach by 29.62% and 15.75%, respectively.

Funder

National Natural Science Foundation of China

Fund of Prospective Layout of Scientific Research for NUAA

Publisher

Association for Computing Machinery (ACM)

Subject

Software

同舟云学术

1.学者识别学者识别

2.学术分析学术分析

3.人才评估人才评估

"同舟云学术"是以全球学者为主线,采集、加工和组织学术论文而形成的新型学术文献查询和分析系统,可以对全球学者进行文献检索和人才价值评估。用户可以通过关注某些学科领域的顶尖人物而持续追踪该领域的学科进展和研究前沿。经过近期的数据扩容,当前同舟云学术共收录了国内外主流学术期刊6万余种,收集的期刊论文及会议论文总量共计约1.5亿篇,并以每天添加12000余篇中外论文的速度递增。我们也可以为用户提供个性化、定制化的学者数据。欢迎来电咨询!咨询电话:010-8811{复制后删除}0370

www.globalauthorid.com

TOP

Copyright © 2019-2024 北京同舟云网络信息技术有限公司
京公网安备11010802033243号  京ICP备18003416号-3