Rocks Coding, Not Development: A Human-Centric, Experimental Evaluation of LLM-Supported SE Tasks

Published:2024-07-12 Issue:FSE Volume:1 Page:699-721
ISSN:2994-970X
Container-title:Proceedings of the ACM on Software Engineering
language:en
Short-container-title:Proc. ACM Softw. Eng.

Author:

Wang Wei¹, Ning Huilong¹, Zhang Gaowei¹, Liu Libo², Wang Yi¹

Affiliation:

1. Beijing University of Posts and Telecommunications, Beijing, China

2. University of Melbourne, Melbourne, Australia

Abstract

Recently, generative AI based on large language models (LLMs) has been gaining momentum for its impressive, high-quality performance in multiple domains, particularly after the release of ChatGPT. Many believe that LLMs have the potential to perform general-purpose problem-solving in software development and replace human software developers. Nevertheless, there is a lack of serious investigation into the capability of these LLM techniques in fulfilling software development tasks. In a controlled 2 × 2 between-subject experiment with 109 participants, we examined whether and to what degree working with ChatGPT was helpful in a coding task and a typical software development task, and how people work with ChatGPT. We found that while ChatGPT performed well in solving simple coding problems, its performance in supporting typical software development tasks was not that good. We also observed the interactions between participants and ChatGPT and identified relations between the interactions and the outcomes. Our study thus provides first-hand insights into using ChatGPT to fulfill software engineering tasks with real-world developers and motivates the need for novel interaction mechanisms that help developers effectively work with large language models to achieve desired outcomes.

Funder

National Natural Science Foundation of China

Publisher

Association for Computing Machinery (ACM)

Link

https://dl.acm.org/doi/pdf/10.1145/3643758

Reference: 69 articles.

1. A Research Agenda for Hybrid Intelligence: Augmenting Human Intellect With Collaborative, Adaptive, Responsible, and Explainable Artificial Intelligence

2. Human-centered AI: The role of Human-centered Design Research in the development of AI

3. Experimentation in software engineering

4. Ali Borji. 2023. Generated faces in the wild: Quantitative comparison of Stable Diffusion, Midjourney and DALL-E 2. arXiv:2210.00586