Abstract
We evaluated the performance of ChatGPT, a large language model, on the United States Medical Licensing Exam (USMLE), which consists of three exams: Step 1, Step 2CK, and Step 3. ChatGPT performed at or near the passing threshold for all three exams without any specialized training or reinforcement. Additionally, ChatGPT demonstrated a high level of concordance and insight in its explanations. These results suggest that large language models may have the potential to assist with medical education and, potentially, clinical decision-making.
Publisher
Public Library of Science (PLoS)
Cited by
1637 articles.