Affiliation:
1. Department of Theoretical and Applied Sciences, University of Insubria, 21100 Varese, Italy
Abstract
The advent of Generative Artificial Intelligence is raising essential questions about whether and when AI will replace human abilities in accomplishing everyday tasks. This issue is particularly relevant in the domain of software development, where generative AI appears to have strong skills in solving coding problems and generating software source code. In this paper, an empirical evaluation of AI-generated source code is performed: three complex coding problems (selected from the exams for the Java Programming course at the University of Insubria) are prompted to three different Large Language Model (LLM) engines, and the generated code is evaluated for correctness and quality by means of human-implemented test suites and quality metrics. The experimentation shows that the three evaluated LLM engines are able to solve the three exams, but only under the constant supervision of software experts. Currently, LLM engines need human-expert support to produce running code of good quality.
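As a concrete illustration of the evaluation approach described above, the sketch below shows how a human-implemented JUnit 5 test suite could assess the correctness of LLM-generated Java code against the expected behaviour of an exam problem. The class under test (WordStats) and its method are hypothetical placeholders for illustration only, not the actual exam problems used in the paper.

import static org.junit.jupiter.api.Assertions.*;

import org.junit.jupiter.api.Test;

class WordStatsTest {

    // Hypothetical stand-in for LLM-generated code under evaluation.
    static class WordStats {
        static int countDistinctWords(String text) {
            if (text == null || text.isBlank()) return 0;
            return (int) java.util.Arrays.stream(text.trim().split("\\s+"))
                    .map(String::toLowerCase)
                    .distinct()
                    .count();
        }
    }

    @Test
    void countsDistinctWordsIgnoringCase() {
        // The suite encodes the problem's expected behaviour; a failing
        // assertion flags incorrect generated code.
        assertEquals(2, WordStats.countDistinctWords("Hello hello world"));
    }

    @Test
    void handlesEmptyAndNullInput() {
        assertEquals(0, WordStats.countDistinctWords(""));
        assertEquals(0, WordStats.countDistinctWords(null));
    }
}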