Enhancing Program Synthesis with Large Language Models Using Many-Objective Grammar-Guided Genetic Programming

Author:

Tao Ning (1), Ventresque Anthony (2), Nallur Vivek (1), Saber Takfarinas (3)

Affiliation:

1. School of Computer Science, University College Dublin, D04 V1W8 Dublin, Ireland

2. School of Computer Science and Statistics, Trinity College Dublin, D02 PN40 Dublin, Ireland

3. School of Computer Science, University of Galway, H91 TK33 Galway, Ireland

Abstract

The ability to automatically generate code, i.e., program synthesis, is one of the most important applications of artificial intelligence (AI). Currently, two AI techniques are leading the way: large language models (LLMs) and genetic programming (GP) methods, each with its own strengths and weaknesses. While LLMs have shown success in program synthesis from a task description, they often struggle to generate correct code due to ambiguity in task specifications, complex programming syntax, and a lack of reliability in the generated code. Furthermore, their generative nature limits their ability to fix erroneous code through iterative LLM prompting. Grammar-guided genetic programming (G3P), one of the top GP methods, has been shown capable of evolving programs that fit a defined Backus–Naur-form (BNF) grammar based on a set of input/output tests that help guide the search process, while ensuring that the generated code does not include calls to untrustworthy libraries or poorly structured snippets. However, G3P still struggles to generate code for complex tasks. A recent study that combined both approaches (G3P and LLMs) by seeding an LLM-generated program into the initial population of G3P has shown promising results. However, that approach rapidly loses the seeded information over the evolutionary process, which hinders its performance. In this work, we propose combining an LLM (specifically ChatGPT) with a many-objective G3P (MaOG3P) framework in two parts: (i) providing the LLM-generated code as a seed to the evolutionary process following a grammar-mapping phase that creates an avenue for program evolution and error correction; and (ii) leveraging many-objective similarity measures towards the LLM-generated code to guide the search process throughout the evolution. The idea behind using the similarity measures is that the LLM-generated code is likely to be close to the correct fitting code. Our approach compels any generated program to adhere to the BNF grammar, ultimately mitigating security risks and improving code quality. Experiments on a well-known and widely used program synthesis dataset show that our approach successfully improves the synthesis of grammar-fitting code for several tasks.
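As an illustration of part (ii), the following is a minimal sketch of how similarity to an LLM-generated seed could sit alongside test error as separate objectives compared by Pareto dominance. It is not the MaOG3P implementation: the abstract does not name the similarity measures, so the character-level and token-level distances used here are assumptions, as are the toy task (doubling an integer), the function name `solve`, and every helper name. The grammar-mapping phase is not shown.

```python
import difflib
from typing import Sequence, Tuple

Objectives = Tuple[float, ...]


def failed_tests(source: str, cases: Sequence[Tuple[int, int]]) -> int:
    """Number of input/output tests the candidate fails (lower is better)."""
    namespace: dict = {}
    try:
        exec(source, namespace)  # sketch only: a real system would sandbox this
        solve = namespace["solve"]  # hypothetical entry-point name
    except Exception:
        return len(cases)
    failures = 0
    for arg, expected in cases:
        try:
            if solve(arg) != expected:
                failures += 1
        except Exception:
            failures += 1
    return failures


def char_distance(candidate: str, seed: str) -> float:
    """Assumed measure: 1 minus difflib's character-level similarity ratio."""
    return 1.0 - difflib.SequenceMatcher(None, candidate, seed).ratio()


def token_distance(candidate: str, seed: str) -> float:
    """Assumed measure: Jaccard distance over whitespace-separated tokens."""
    a, b = set(candidate.split()), set(seed.split())
    if not a and not b:
        return 0.0
    return 1.0 - len(a & b) / len(a | b)


def objectives(candidate: str, seed: str,
               cases: Sequence[Tuple[int, int]]) -> Objectives:
    """Objective vector to minimise: test error plus distances to the seed."""
    return (
        float(failed_tests(candidate, cases)),
        char_distance(candidate, seed),
        token_distance(candidate, seed),
    )


def dominates(a: Objectives, b: Objectives) -> bool:
    """Pareto dominance: a is no worse everywhere and strictly better somewhere."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


# Toy data: the seed stands in for LLM output that is close but not correct.
CASES = [(0, 0), (1, 2), (3, 6)]
SEED = "def solve(x):\n    return x * 2 + 1\n"
CAND_A = "def solve(x):\n    return x * 2\n"   # small repair of the seed
CAND_B = "def solve(x):\n    return x - 7\n"   # unrelated and incorrect

if dominates(objectives(CAND_A, SEED, CASES), objectives(CAND_B, SEED, CASES)):
    print("CAND_A dominates CAND_B")
```

In this sketch the repaired candidate is preferred on every objective. Keeping the distances as separate objectives, rather than folding them into a single weighted score, means a candidate that passes more tests but has drifted from the seed is neither dominated nor automatically preferred.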

Funder

Science Foundation Ireland

Publisher

MDPI AG
