Fluent but Not Factual: A Comparative Analysis of ChatGPT and Other AI Chatbots’ Proficiency and Originality in Scientific Writing for Humanities

Author:

Lozić Edisa 1, Štular Benjamin 1

Affiliation:

1. Research Centre of the Slovenian Academy of Sciences and Arts, 1000 Ljubljana, Slovenia

Abstract

Historically, mastery of writing was deemed essential to human progress. However, recent advances in generative AI have marked an inflection point in this narrative, including for scientific writing. This article provides a comprehensive analysis of the capabilities and limitations of six AI chatbots in scholarly writing in the humanities and archaeology. The methodology was based on tagging AI-generated content for quantitative accuracy and qualitative precision by human experts. Quantitative accuracy assessed the factual correctness in a manner similar to grading students, while qualitative precision gauged the scientific contribution similar to reviewing a scientific article. In the quantitative test, ChatGPT-4 scored near the passing grade (−5) whereas ChatGPT-3.5 (−18), Bing (−21) and Bard (−31) were not far behind. Claude 2 (−75) and Aria (−80) scored much lower. In the qualitative test, all AI chatbots, but especially ChatGPT-4, demonstrated proficiency in recombining existing knowledge, but all failed to generate original scientific content. As a side note, our results suggest that with ChatGPT-4, the size of large language models has reached a plateau. Furthermore, this paper underscores the intricate and recursive nature of human research. This process of transforming raw data into refined knowledge is computationally irreducible, highlighting the challenges AI chatbots face in emulating human originality in scientific writing. Our results apply to the state of affairs in the third quarter of 2023. In conclusion, while large language models have revolutionised content generation, their ability to produce original scientific contributions in the humanities remains limited. We expect this to change in the near future as current large language model-based AI chatbots evolve into large language model-powered software.

Funder

European Union’s Horizon Europe research and innovation programme

Slovenian Research and Innovation Agency

Publisher

MDPI AG

Subject

Computer Networks and Communications

Cited by 1 article.