FQN Inference in Partial Code by Prompt-tuned Language Model of Code

Authors:

Huang Qing (1), Yuan Zhiqiang (1), Xing Zhenchang (2), Peng Xin (3), Xu Xiwei (4), Lu Qinghua (4)

Affiliation:

1. Jiangxi Normal University, China

2. CSIRO’s Data61 & Australian National University, Australia

3. Fudan University, China

4. CSIRO’s Data61, Australia

Abstract

Partial code usually involves non-fully-qualified type names (non-FQNs) and undeclared receiving objects. Resolving the FQNs of these non-FQN types and undeclared receiving objects (referred to as type inference) is the prerequisite to effective search and reuse of partial code. Existing dictionary-lookup based methods build a symbolic knowledge base of API names and code contexts, which involves significant compilation overhead and is sensitive to unseen API names and code context variations. In this article, we propose using a prompt-tuned code masked language model (MLM) as a neural knowledge base for type inference, called POME, which is lightweight and has minimal requirements on code compilation. Unlike existing approaches, which match symbol names and contexts, POME infers FQNs from the syntax and usage knowledge encapsulated in the prompt-tuned code MLM through a cloze-style fill-in-blank strategy. POME is integrated as a plug-in into web and integrated development environments (IDEs) to assist developers in inferring FQNs in the real world. We systematically evaluate POME on a large amount of source code from GitHub and Stack Overflow, and explore its generalization and hybrid capability. The results validate the effectiveness of the POME design and its applicability to partial code type inference, and show that it can be easily extended to different programming languages (PLs). POME can also be used to generate a PL-hybrid type inference model that provides a one-for-all solution. As the first of its kind, our neural type inference method opens the door to many innovative ways of using partial code.
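
To make the cloze-style fill-in-blank idea concrete, the following is a minimal Python sketch. It is an illustration under stated assumptions, not POME's implementation: the abstract does not specify the prompt layout, so the `<mask>` token, the fixed number of blanks, and the helper names `build_cloze_prompt` and `infer_fqn` are all hypothetical. The masked language model is represented by a plain callable, so any predictor could be plugged in.

```python
import re
from typing import Callable, List

MASK = "<mask>"  # assumed mask token; real code MLMs define their own


def build_cloze_prompt(code: str, simple_name: str, n_masks: int) -> str:
    """Put n_masks blanks in front of the first whole-word use of simple_name.

    Hypothetical layout: the blanks stand for the missing package prefix.
    """
    blanks = MASK * n_masks
    pattern = r"\b" + re.escape(simple_name) + r"\b"
    # A function replacement avoids backslash handling in the replacement text.
    return re.sub(pattern, lambda m: blanks + m.group(0), code, count=1)


def infer_fqn(
    code: str,
    simple_name: str,
    n_masks: int,
    fill_masks: Callable[[str], List[str]],
) -> str:
    """Join the tokens predicted for the blanks with the simple name.

    fill_masks is a stand-in for a masked language model: it receives the
    prompt and returns one predicted token per blank, in order.
    """
    prompt = build_cloze_prompt(code, simple_name, n_masks)
    predicted = fill_masks(prompt)
    return "".join(predicted) + simple_name


if __name__ == "__main__":
    snippet = "File f = new File(path);"

    # Toy predictor with a hard-coded answer, used only to exercise the flow.
    def toy_model(prompt: str) -> List[str]:
        return ["java", ".", "io", "."]

    print(build_cloze_prompt(snippet, "File", 4))
    print(infer_fqn(snippet, "File", 4, toy_model))
```

In this sketch the toy predictor always returns the same tokens; in a real setting the callable would wrap a prompt-tuned code MLM, and the number of blanks would have to be chosen or searched, since FQNs differ in length.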

Funder

National Natural Science Foundation of China

Graduate Innovative Special Fund Projects of Jiangxi Province

Publisher

Association for Computing Machinery (ACM)

Subject

Software


Cited by 1 article.
