From GPT-3.5 to GPT-4.o: A Leap in AI’s Medical Exam Performance

Author:

Kipp Markus 1

Affiliation:

1. Rostock University Medical Center, Institute of Anatomy, 18057 Rostock, Germany

Abstract

ChatGPT is a large language model trained on increasingly large datasets to perform diverse language-based tasks. It is capable of answering multiple-choice questions, such as those posed by medical examinations. ChatGPT has attracted considerable attention in both academic and non-academic domains in recent months. In this study, we aimed to assess GPT’s performance on anatomical multiple-choice questions retrieved from medical licensing examinations in Germany. Two different versions were compared. GPT-3.5 demonstrated moderate accuracy, correctly answering 60–64% of questions from the autumn 2022 and spring 2021 exams. In contrast, GPT-4.o showed significant improvement, achieving 93% accuracy on the autumn 2022 exam and 100% on the spring 2021 exam. When tested on 30 unique questions not available online, GPT-4.o maintained a 96% accuracy rate. Furthermore, GPT-4.o consistently outperformed medical students across six state exams, with a mean score of 95.54% compared with the students’ 72.15%, a statistically significant difference. The study demonstrates that GPT-4.o outperforms both its predecessor, GPT-3.5, and a cohort of medical students, indicating its potential as a powerful tool in medical education and assessment. This improvement highlights the rapid evolution of LLMs and suggests that AI could play an increasingly important role in supporting and enhancing medical training, potentially offering supplementary resources for students and professionals. However, further research is needed to assess the limitations and practical applications of such AI systems in real-world medical practice.

Publisher

MDPI AG
