How Large Language Models Perform on the United States Medical Licensing Examination: A Systematic Review

Published: 2023-09-03

Author:

Brin Dana, Sorin Vera, Konen Eli, Nadkarni Girish, Glicksberg Benjamin S, Klang Eyal

Abstract

Objective: The United States Medical Licensing Examination (USMLE) assesses physicians’ competency and passing is a requirement to practice medicine in the U.S. With the emergence of large language models (LLMs) like ChatGPT and GPT-4, understanding their performance on these exams illuminates their potential in medical education and healthcare.

Materials and Methods: A literature search following the 2020 PRISMA guidelines was conducted, focusing on studies using official USMLE questions and publicly available LLMs.

Results: Three relevant studies were found, with GPT-4 showcasing the highest accuracy rates of 80-90% on the USMLE. Open-ended prompts typically outperformed multiple-choice ones, with 5-shot prompting slightly edging out zero-shot.

Conclusion: LLMs, especially GPT-4, display proficiency in tackling USMLE-standard questions. While the USMLE is a structured evaluation tool, it may not fully capture the expansive capabilities and limitations of LLMs in medical scenarios. As AI integrates further into healthcare, ongoing assessments against trusted benchmarks are essential.

Publisher

Cold Spring Harbor Laboratory

References (21 articles)

1. About the USMLE | USMLE [Internet]. [cited 2023 Aug 2]. Available from: https://www.usmle.org/about-usmle

2. USMLE step 1 and step 2 CK as indicators of resident performance. BMC Med Educ. 2023.

3. The US Residency Selection Process After the United States Medical Licensing Examination Step 1 Pass/Fail Change: Overview for Applicants and Educators. JMIR Med Educ. 2023.

4. The USMLE Step 1 Decision: An Opportunity for Medical Education and Training. JAMA. 2020.

5. Needs, Challenges, and Applications of Artificial Intelligence in Medical Education Curriculum

Cited by 3 articles.

1. Capability of GPT-4V(ision) in Japanese National Medical Licensing Examination. 2023-11-08.

2. Applications of Large Language Models (LLMs) in Breast Cancer Care. 2023-11-04.

3. Diagnostic Accuracy of GPT Multimodal Analysis on USMLE Questions Including Text and Visuals. 2023-10-31.