Unveiling the Potential of AI in Plastic Surgery Education: A Comparative Study of Leading AI Platforms’ Performance on In-training Examinations

Published:2024-06 Issue:6 Volume:12 Page:e5929
ISSN:2169-7574
Container-title:Plastic and Reconstructive Surgery - Global Open
language:en

Author:

DiDonna Nicole¹,Shetty Pragna N.²,Khan Kamran²,Damitz Lynn²

Affiliation:

1. School of Medicine, University of North Carolina, Chapel Hill, N.C.

2. Division of Plastic and Reconstructive Surgery, University of North Carolina, Chapel Hill, N.C.

Abstract

Background: Within the last few years, artificial intelligence (AI) chatbots have sparked fascination for their potential as an educational tool. Although it has been documented that one such chatbot, ChatGPT, is capable of performing at a moderate level on plastic surgery examinations and has the capacity to become a beneficial educational tool, the potential of other chatbots remains unexplored.

Methods: To investigate the efficacy of AI chatbots in plastic surgery education, performance on the 2019–2023 Plastic Surgery In-service Training Examination (PSITE) was compared among seven popular AI platforms: ChatGPT-3.5, ChatGPT-4.0, Google Bard, Google PaLM, Microsoft Bing AI, Claude, and My AI by Snapchat. Answers were evaluated for accuracy and incorrect responses were characterized by question category and error type.

Results: ChatGPT-4.0 outperformed the other platforms, reaching accuracy rates up to 79%. On the 2023 PSITE, ChatGPT-4.0 ranked in the 95th percentile of first-year residents; however, relative performance worsened when compared with upper-level residents, with the platform ranking in the 12th percentile of sixth-year residents. The performance among other chatbots was comparable, with their average PSITE score (2019–2023) ranging from 48.6% to 57.0%.

Conclusions: Results of our study indicate that ChatGPT-4.0 has potential as an educational tool in the field of plastic surgery; however, given their poor performance on the PSITE, the use of other chatbots should be cautioned against at this time. To our knowledge, this is the first article comparing the performance of multiple AI chatbots within the realm of plastic surgery education.

Publisher

Ovid Technologies (Wolters Kluwer Health)
