Understanding How ChatGPT May Become a Clinical Administrative Tool Through an Investigation on the Ability to Answer Common Patient Questions Concerning Ulnar Collateral Ligament Injuries-Reference-Cited by-同舟云学术

Understanding How ChatGPT May Become a Clinical Administrative Tool Through an Investigation on the Ability to Answer Common Patient Questions Concerning Ulnar Collateral Ligament Injuries

Published:2024-07 Issue:7 Volume:12 Page:
ISSN:2325-9671
Container-title:Orthopaedic Journal of Sports Medicine
language:en
Short-container-title:Orthopaedic Journal of Sports Medicine

Author:

Varady Nathan H.¹,Lu Amy Z.²,Mazzucco Michael²,Dines Joshua S.¹,Altchek David W.²,Williams Riley J.¹,Kunze Kyle N.¹

Affiliation:

1. Department of Orthopaedic Surgery, Hospital for Special Surgery, New York, New York, USA

2. Weill Cornell Medical College, New York, New York, USA

Abstract

Background: The consumer availability and automated response functions of chat generator pretrained transformer (ChatGPT-4), a large language model, poise this application to be utilized for patient health queries and may have a role in serving as an adjunct to minimize administrative and clinical burden. Purpose: To evaluate the ability of ChatGPT-4 to respond to patient inquiries concerning ulnar collateral ligament (UCL) injuries and compare these results with the performance of Google. Study Design: Cross-sectional study. Methods: Google Web Search was used as a benchmark, as it is the most widely used search engine worldwide and the only search engine that generates frequently asked questions (FAQs) when prompted with a query, allowing comparisons through a systematic approach. The query “ulnar collateral ligament reconstruction” was entered into Google, and the top 10 FAQs, answers, and their sources were recorded. ChatGPT-4 was prompted to perform a Google search of FAQs with the same query and to record the sources of answers for comparison. This process was again replicated to obtain 10 new questions requiring numeric instead of open-ended responses. Finally, responses were graded independently for clinical accuracy (grade 0 = inaccurate, grade 1 = somewhat accurate, grade 2 = accurate) by 2 fellowship-trained sports medicine surgeons (D.W.A, J.S.D.) blinded to the search engine and answer source. Results: ChatGPT-4 used a greater proportion of academic sources than Google to provide answers to the top 10 FAQs, although this was not statistically significant (90% vs 50%; P = .14). In terms of question overlap, 40% of the most common questions on Google and ChatGPT-4 were the same. When comparing FAQs with numeric responses, 20% of answers were completely overlapping, 30% demonstrated partial overlap, and the remaining 50% did not demonstrate any overlap. All sources used by ChatGPT-4 to answer these FAQs were academic, while only 20% of sources used by Google were academic ( P = .0007). The remaining Google sources included social media (40%), medical practices (20%), single-surgeon websites (10%), and commercial websites (10%). The mean (± standard deviation) accuracy for answers given by ChatGPT-4 was significantly greater compared with Google for the top 10 FAQs (1.9 ± 0.2 vs 1.2 ± 0.6; P = .001) and top 10 questions with numeric answers (1.8 ± 0.4 vs 1 ± 0.8; P = .013). Conclusion: ChatGPT-4 is capable of providing responses with clinically relevant content concerning UCL injuries and reconstruction. ChatGPT-4 utilized a greater proportion of academic websites to provide responses to FAQs representative of patient inquiries compared with Google Web Search and provided significantly more accurate answers. Moving forward, ChatGPT has the potential to be used as a clinical adjunct when answering queries about UCL injuries and reconstruction, but further validation is warranted before integrated or autonomous use in clinical settings.

Publisher

SAGE Publications

Link

https://journals.sagepub.com/doi/pdf/10.1177/23259671241257516

Reference32 articles.

1. Comparing Physician and Artificial Intelligence Chatbot Responses to Patient Questions Posted to a Public Social Media Forum

2. The Global Burden of Musculoskeletal Pain—Where to From Here?

3. Elbow Ulnar Collateral Ligament Injuries: Indications, Management, and Outcomes

4. Artificial intelligence in orthopaedics: can Chat Generative Pre-trained Transformer (ChatGPT) pass Section 1 of the Fellowship of the Royal College of Surgeons (Trauma & Orthopaedics) examination?

5. Using a Google Web Search Analysis to Assess the Utility of ChatGPT in Total Joint Arthroplasty

Cited by 1 articles. 订阅此论文施引文献订阅此论文施引文献，注册后可以免费订阅5篇论文的施引文献，订阅后可以查看论文全部施引文献

1. Editorial Commentary: The Scope of Medical Research Concerning ChatGPT Remains Limited by Lack of Originality and Leaves More Questions Than Answers;Arthroscopy: The Journal of Arthroscopic & Related Surgery;2024-09