AceCoder: An Effective Prompting Technique Specialized in Code Generation

Authors:

Li Jia¹, Zhao Yunfei¹, Li Yongmin¹, Li Ge¹, Jin Zhi¹

Affiliation:

1. Key Laboratory of High Confidence Software Technologies (Peking University), Ministry of Education; School of Computer Science, Peking University, China

Abstract

Large Language Models (LLMs) have shown great success in code generation. LLMs take a prompt as input and output code. How to craft prompts (i.e., prompting techniques) is a key question. Existing prompting techniques are designed for natural language generation and achieve low accuracy in code generation. In this paper, we propose a new prompting technique named AceCoder. Our motivation is that code generation faces two unique challenges (i.e., requirement understanding and code implementation). AceCoder contains two novel mechanisms (i.e., guided code generation and example retrieval) to address these challenges. ➊ Guided code generation first asks LLMs to analyze requirements and output an intermediate preliminary (e.g., test cases). The preliminary clarifies the requirements and tells LLMs “what to write”. ➋ Example retrieval selects similar programs as examples in prompts, which provide relevant content (e.g., algorithms, APIs) and teach LLMs “how to write”. We apply AceCoder to four LLMs (e.g., GPT-3.5, CodeGeeX) and evaluate it on three public benchmarks using Pass@k. Results show that AceCoder significantly improves the performance of LLMs on code generation. In terms of Pass@1, AceCoder outperforms the state-of-the-art baseline by up to 56.4% on MBPP, 70.7% on MBJP, and 88.4% on MBJSP. AceCoder is effective for LLMs of different sizes (i.e., 6B to 13B) and different languages (i.e., Python, Java, and JavaScript). A human evaluation shows that human developers prefer programs generated by AceCoder.
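The Pass@k metric mentioned above is commonly computed with the standard unbiased estimator used in code-generation benchmarks: generate n samples per problem, count the c samples that pass the unit tests, and estimate the probability that at least one of k randomly drawn samples is correct. A minimal sketch (the function name and the n, c, k parameter names are illustrative, not taken from the paper):

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased Pass@k estimator.

    n: total generated samples per problem
    c: number of samples that pass the unit tests
    k: budget of samples drawn per problem
    Returns P(at least one of k drawn samples is correct)
    = 1 - C(n - c, k) / C(n, k).
    """
    if n - c < k:
        # Fewer failing samples than the draw size: some draw must succeed.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)
```

For example, with n = 2 samples of which c = 1 passes, Pass@1 evaluates to 0.5; a benchmark score averages this quantity over all problems.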

Publisher

Association for Computing Machinery (ACM)


Cited by 2 articles.

1. When to Stop? Towards Efficient Code Generation in LLMs with Excess Token Prevention;Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis;2024-09-11

2. Structured Chain-of-Thought Prompting for Code Generation;ACM Transactions on Software Engineering and Methodology;2024-08-29

