Affiliation:
1. Division of Plastic Surgery, Mayo Clinic, Jacksonville, FL 32224, USA
2. Center for Digital Health, Mayo Clinic, Rochester, MN 55905, USA
Abstract
In the U.S., diagnostic errors are common across healthcare settings because of factors such as complex procedures and the involvement of multiple healthcare providers, and they are often exacerbated by inadequate initial evaluations. This study explores the role of Large Language Models (LLMs), specifically OpenAI’s ChatGPT-4 and Google Gemini, in improving emergency decision-making in plastic and reconstructive surgery by evaluating their effectiveness both with and without physical examination data. Thirty medical vignettes covering emergency conditions such as fractures and nerve injuries were used to assess the models’ diagnostic and management responses. Medical professionals evaluated these responses against established clinical guidelines, and the results were analyzed statistically, including with the Wilcoxon rank-sum test. ChatGPT-4 consistently outperformed Gemini in both diagnosis and management, irrespective of the presence of physical examination data, although neither model’s performance differed significantly between the two data scenarios. In conclusion, while ChatGPT-4 demonstrates superior accuracy and management capabilities, the addition of physical examination data, although it enhanced response detail, did not significantly surpass traditional medical resources. This underscores the utility of AI in supporting clinical decision-making, particularly in scenarios with limited data, and suggests its role as a complement to, rather than a replacement for, comprehensive clinical evaluation and expertise.