Evaluating ChatGPT responses in the context of a 53-year-old male with a femoral neck fracture: a qualitative analysis

Published: 2023-09-30
ISSN: 1432-1068
Container-title: European Journal of Orthopaedic Surgery & Traumatology
Language: en
Short-container-title: Eur J Orthop Surg Traumatol

Author:

Zhou Yushy, Moon Charles, Szatkowski Jan, Moore Derek, Stevens Jarrad

Abstract

Purpose: The integration of artificial intelligence (AI) tools, such as ChatGPT, in clinical medicine and medical education has gained significant attention due to their potential to support decision-making and improve patient care. However, there is a need to evaluate the benefits and limitations of these tools in specific clinical scenarios.

Methods: This study used a case study approach within the field of orthopaedic surgery. A clinical case report featuring a 53-year-old male with a femoral neck fracture was used as the basis for evaluation. ChatGPT, a large language model, was asked to respond to clinical questions related to the case. The responses generated by ChatGPT were evaluated qualitatively, considering their relevance, justification, and alignment with the responses of real clinicians. Alternative dialogue protocols were also employed to assess the impact of additional prompts and contextual information on ChatGPT responses.

Results: ChatGPT generally provided clinically appropriate responses to the questions posed in the clinical case report. However, the level of justification and explanation varied across the generated responses. Occasionally, clinically inappropriate responses and inconsistencies were observed in the generated responses across different dialogue protocols and on separate days.

Conclusions: The findings of this study highlight both the potential and limitations of using ChatGPT in clinical practice. While ChatGPT demonstrated the ability to provide relevant clinical information, the lack of consistent justification and occasional clinically inappropriate responses raise concerns about its reliability. These results underscore the importance of careful consideration and validation when using AI tools in healthcare. Further research and clinician training are necessary to effectively integrate AI tools like ChatGPT, ensuring their safe and reliable use in clinical decision-making.

Funder

University of Melbourne

Publisher

Springer Science and Business Media LLC

Subject

Orthopedics and Sports Medicine, Surgery

Link

https://link.springer.com/content/pdf/10.1007/s00590-023-03742-4.pdf
