Chat Generative Pretraining Transformer Answers Patient-focused Questions in Cervical Spine Surgery

Published:2024-03-21 Issue:6 Volume:37 Page:E278-E281
ISSN:2380-0186
Container-title:Clinical Spine Surgery
language:en

Author:

Subramanian Tejas¹², Araghi Kasra¹, Amen Troy B.¹, Kaidi Austin¹, Sosa Branden², Shahi Pratyush¹, Qureshi Sheeraz¹², Iyer Sravisht¹²

Affiliation:

1. Department of Orthopedic Surgery, Hospital for Special Surgery

2. Weill Cornell Medicine, New York, NY

Abstract

Study Design: Review of Chat Generative Pretraining Transformer (ChatGPT) outputs to select patient-focused questions.

Objective: We aimed to examine the quality of ChatGPT responses to cervical spine questions.

Background: The use of artificial intelligence to improve the patient experience across medicine is growing rapidly. One such use is patient education. For the first time on a large scale, patients can ask targeted questions and receive similarly targeted answers. Although patients may use these resources to assist in decision-making, few data exist regarding their accuracy, especially within orthopedic surgery and, more specifically, spine surgery.

Methods: We compiled 9 questions that cervical spine surgeons are frequently asked in the clinic to test the ability of ChatGPT version 3.5 to answer questions on a nuanced topic. Responses were reviewed by 2 independent reviewers on a Likert scale for the accuracy of information presented (0–5 points), appropriateness in giving a specific answer (0–3 points), and readability for a layperson (0–2 points). Readability was also assessed through Flesch-Kincaid grade level analysis for the original prompt and for a second prompt asking for rephrasing at the sixth-grade reading level.

Results: On average, ChatGPT’s responses scored 7.1/10. Accuracy was rated 4.1/5 on average, appropriateness 1.8/3, and readability 1.2/2. The Flesch-Kincaid analysis placed the responses at the 13.5 grade level originally and at the 11.2 grade level after prompting.

Conclusions: ChatGPT has the capacity to be a powerful means for patients to gain important and specific information regarding their pathologies and surgical options. However, these responses are limited in their accuracy, and their readability is not optimal for the average patient. Despite these limitations in ChatGPT’s capability to answer these nuanced questions, the technology is impressive, and surgeons should be aware that patients will likely increasingly rely on it.
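The abstract does not say which tool produced the Flesch-Kincaid grade levels, so the following is only a minimal sketch of how such a score can be computed. It uses the standard Flesch-Kincaid grade level formula, 0.39 × (words / sentences) + 11.8 × (syllables / words) − 15.59, together with a crude vowel-group syllable counter that is an assumption of this sketch and will differ from dedicated readability software. The function names `count_syllables`, `flesch_kincaid_grade`, and `rubric_total` are hypothetical; `rubric_total` simply adds the three rubric components described in the Methods.

```python
import re


def count_syllables(word: str) -> int:
    """Estimate syllables by counting vowel groups (rough heuristic, an assumption)."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Treat a trailing 'e' as silent, except in endings such as "-le" or "-ee".
    if word.endswith("e") and not word.endswith(("le", "ee")) and count > 1:
        count -= 1
    return max(count, 1)


def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid grade level using the standard published formula."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * len(words) / len(sentences)
            + 11.8 * syllables / len(words)
            - 15.59)


def rubric_total(accuracy: float, appropriateness: float, readability: float) -> float:
    """Sum the three rubric components (maxima of 5, 3, and 2 points per the Methods)."""
    for value, maximum in ((accuracy, 5), (appropriateness, 3), (readability, 2)):
        if not 0 <= value <= maximum:
            raise ValueError(f"score {value} is outside the range 0-{maximum}")
    return accuracy + appropriateness + readability


if __name__ == "__main__":
    plain = "The cat sat on the mat."
    dense = ("Cervical radiculopathy frequently necessitates "
             "comprehensive multidisciplinary evaluation.")
    print(flesch_kincaid_grade(plain))
    print(flesch_kincaid_grade(dense))
    # The reported component means add up to the reported overall mean.
    print(round(rubric_total(4.1, 1.8, 1.2), 1))
```

Because the syllable heuristic is approximate, absolute grade levels from this sketch should not be compared directly with the 13.5 and 11.2 values reported above; it is better suited to comparing two versions of the same text, such as an original response and its simplified rephrasing.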

Publisher

Ovid Technologies (Wolters Kluwer Health)

References (17 articles)

1. Readability of patient-oriented online dermatology resources;Tulbert;J Clin Aesthet Dermatol,2011

2. Situating Wikipedia as a health information resource in various contexts: a scoping review;Smith;PLoS One,2020

3. Using artificial intelligence to answer common patient-focused questions in minimally invasive spine surgery;Subramanian;J Bone Joint Surg Am,2023

4. Provider referral patterns and surgical utilization among new patients seen in spine clinic;Araghi;Spine (Phila Pa 1976),2023

5. NDI <21 denotes patient acceptable symptom state after degenerative cervical spine surgery;Shahi;Spine (Phila Pa 1976),2023

Cited by 1 article.

1. Currently Available Large Language Models Do Not Provide Musculoskeletal Treatment Recommendations That Are Concordant With Evidence-Based Clinical Practice Guidelines;Arthroscopy: The Journal of Arthroscopic & Related Surgery;2024-08