Recent Advances in Intelligent Source Code Generation: A Survey on Natural Language Based Studies

Published: 2021-09-07 Issue: 9 Volume: 23 Page: 1174
ISSN: 1099-4300
Container-title: Entropy
Language: en
Short-container-title: Entropy

Author:

Yang Chen, Liu Yan, Yin Changqing

Abstract

Source Code Generation (SCG) is a prevalent research field in automated software engineering that maps specific descriptions to various sorts of executable code. As intensive studies accumulate, diverse SCG types that integrate different scenarios and contexts continue to emerge. As the ultimate purpose of SCG, Natural Language-based Source Code Generation (NLSCG) is growing into an attractive and challenging field, owing to the expressiveness and extremely high level of abstraction of its input. The rapidly growing large-scale datasets drawn from open-source code repositories and Q&A resources, innovation in machine learning algorithms, and the growth of computing capacity make the NLSCG field promising and create more opportunities for implementing and refining models. We also observed a recent surge of interest in NLSCG-related studies, which follow quite varied technical schools. However, many studies are bound to specific datasets with customization issues, producing occasional successful solutions with tentative technical methods. No systematic study has yet explored and promoted the further development of this field. We carried out a systematic literature survey and tool research to find potential improvement directions. First, we position the role of NLSCG among various SCG genres and specify the generation context empirically via software development domain knowledge and programming experience. Second, we explore the selected studies collected by a thoughtfully designed snowballing process, clarify the NLSCG field, and understand the NLSCG problem, which lays a foundation for our subsequent investigation. Third, we model the research problems in terms of technical focus and adaptive challenges, and elaborate on insights gained from the NLSCG research backlog. Finally, we summarize the latest technology landscape of the transformation model and depict the critical tactics used in the essential components and their correlations.
This research addresses the challenges of bridging the gap between natural language processing and source code analytics, outlines different dimensions of NLSCG research concerns and technical utilities, and shows a bounded technical context of NLSCG to facilitate more future studies in this promising area.

Publisher

MDPI AG

Subject

General Physics and Astronomy

Link

https://www.mdpi.com/1099-4300/23/9/1174/pdf

References: 89 articles.

1. Abstract syntax networks for code generation and semantic parsing;Rabinovich;arXiv,2017

2. Latent predictor networks for code generation;Ling;arXiv,2016

Cited by 8 articles.

1. CoEdPilot: Recommending Code Edits with Learned Prior Edit Relevance, Project-wise Awareness, and Interactive Nature;Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis;2024-09-11

2. Transformers in source code generation: A comprehensive survey;Journal of Systems Architecture;2024-08

3. More than a framework: Sketching out technical enablers for natural language-based source code generation;Computer Science Review;2024-08

4. Generating and Reviewing Programming Codes with Large Language Models: A Systematic Mapping Study;Proceedings of the 20th Brazilian Symposium on Information Systems;2024-05-20

5. A Comparative Review of AI Techniques for Automated Code Generation in Software Development: Advancements, Challenges, and Future Directions;TEM Journal;2024-02-27