Affiliation:
1. IU International University of Applied Sciences, 99084 Erfurt, Germany
Abstract
This paper compares the correctness, efficiency, and maintainability of human-generated and AI-generated program code. We assess computational resource use through time and space complexity as well as runtime and memory usage, and we assess maintainability through lines of code, cyclomatic complexity, Halstead complexity, and the maintainability index. For our experiments, we had generative AIs produce code in Java, Python, and C++ that solves problems defined on the competitive programming website leetcode.com. We selected six LeetCode problems of varying difficulty, resulting in 18 programs generated by each generative AI. GitHub Copilot, powered by Codex (GPT-3.0), performed best, solving 9 of the 18 problems (50.0%), whereas CodeWhisperer did not solve a single problem. BingAI Chat (GPT-4.0) generated correct code for 7 problems (38.9%), ChatGPT (GPT-3.5) and Code Llama (Llama 2) for 4 problems each (22.2%), and StarCoder and InstructCodeT5+ for only 1 problem each (5.6%). Surprisingly, although ChatGPT produced only four correct programs, it was the only generative AI that provided a correct solution to a problem of difficulty level hard. In total, 26 of the AI-generated programs (20.6%) solve their respective problem. For a further 11 incorrect AI-generated programs (8.7%), only minimal modifications are needed to solve the problem, which saves between 8.9% and 71.3% of the time compared with writing the code from scratch.
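The percentages in the abstract can be checked against the reported counts. The following is a minimal sketch in Python, assuming that every tool was evaluated on the same 18 problem-language combinations and that the totals refer to all seven tools together. The names `solved`, `PROBLEMS_PER_TOOL`, and `pct` are illustrative and do not come from the paper.

```python
# Solved counts per tool as reported in the abstract (out of 18 each).
solved = {
    "GitHub Copilot": 9,
    "BingAI Chat": 7,
    "ChatGPT": 4,
    "Code Llama": 4,
    "StarCoder": 1,
    "InstructCodeT5+": 1,
    "CodeWhisperer": 0,
}

# Assumption: six LeetCode problems times three languages per tool.
PROBLEMS_PER_TOOL = 6 * 3


def pct(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Per-tool success rates.
per_tool = {tool: pct(n, PROBLEMS_PER_TOOL) for tool, n in solved.items()}

# Totals across all tools.
total_codes = PROBLEMS_PER_TOOL * len(solved)
total_correct = sum(solved.values())

# Number of incorrect programs needing only minimal fixes (from the abstract).
nearly_correct = 11

if __name__ == "__main__":
    for tool, rate in per_tool.items():
        print(f"{tool}: {solved[tool]}/{PROBLEMS_PER_TOOL} ({rate}%)")
    print(f"Correct overall: {total_correct}/{total_codes} "
          f"({pct(total_correct, total_codes)}%)")
    print(f"Nearly correct: {nearly_correct}/{total_codes} "
          f"({pct(nearly_correct, total_codes)}%)")
```

Under these assumptions the per-tool counts sum to 26 of 126 programs, which matches the 20.6% and 8.7% figures stated above.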