ChatGPT-generated help produces learning gains equivalent to human tutor-authored help on mathematics skills

Author:

Pardos Zachary A., Bhandari Shreya

Abstract

Authoring of help content within educational technologies is labor intensive, requiring many iterations of content creation, refining, and proofreading. In this paper, we conduct an efficacy evaluation of ChatGPT-generated help using a 3 x 4 study design (N = 274) to compare the learning gains of ChatGPT to human tutor-authored help across four mathematics problem subject areas. Participants are randomly assigned to one of three hint conditions (control, human tutor, or ChatGPT) paired with one of four randomly assigned subject areas (Elementary Algebra, Intermediate Algebra, College Algebra, or Statistics). We find that only the ChatGPT condition produces statistically significant learning gains compared to a no-help control, with no statistically significant differences in gains or time-on-task observed between learners receiving ChatGPT vs human tutor help. Notably, ChatGPT-generated help failed quality checks on 32% of problems. This was, however, reducible to nearly 0% for algebra problems and 13% for statistics problems after applying self-consistency, a “hallucination” mitigation technique for Large Language Models.
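The self-consistency step mentioned in the abstract can be illustrated with a minimal sketch: sample several answers to the same problem and keep the most frequent one only if enough samples agree. This record does not describe the paper's exact procedure, so the function name `self_consistent_answer`, the `sample_answer` callable (a stand-in for a model call), the sample count, and the agreement threshold are all assumptions made for illustration.

```python
from collections import Counter
from typing import Callable, Optional


def self_consistent_answer(
    sample_answer: Callable[[], str],  # hypothetical stand-in for one model call
    n_samples: int = 10,               # assumed number of samples, not from the paper
    min_agreement: float = 0.5,        # assumed agreement threshold, not from the paper
) -> Optional[str]:
    """Return the most frequent sampled answer, or None if agreement is too low."""
    # Draw several independent answers to the same problem.
    answers = [sample_answer().strip() for _ in range(n_samples)]
    # Find the answer that appears most often and how many times it appears.
    answer, count = Counter(answers).most_common(1)[0]
    # Accept it only when its share of the samples meets the threshold.
    return answer if count / n_samples >= min_agreement else None


# Usage with canned answers in place of a real model.
canned = iter(["x = 4", "x = 4", "x = 5", "x = 4", "x = 4"])
result = self_consistent_answer(lambda: next(canned), n_samples=5)
```

Returning `None` on low agreement lets a caller withhold generated help for that problem instead of showing a possibly incorrect answer.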

Funder

Peder Sather Center for Advanced Study

Vice Provost of Undergraduate Education, University of California Berkeley

Institute of Cognitive and Brain Sciences, University of California Berkeley

Publisher

Public Library of Science (PLoS)

