Usefulness and Accuracy of Artificial Intelligence Chatbot Responses to Patient Questions for Neurosurgical Procedures

Published: 2024-02-14
ISSN: 0148-396X
Container-title: Neurosurgery
Language: en

Author:

Gajjar Avi A.¹², Kumar Rohit Prem¹, Paliwoda Ethan D.³, Kuo Cathleen C.⁴, Adida Samuel¹⁵, Legarreta Andrew D.¹, Deng Hansen¹, Anand Sharath Kumar¹, Hamilton D. Kojo¹, Buell Thomas J.¹, Agarwal Nitin¹, Gerszten Peter C.¹, Hudson Joseph S.¹

Affiliation:

1. Department of Neurological Surgery, University of Pittsburgh Medical Center, Pittsburgh, Pennsylvania, USA;

2. Department of Neurological Surgery, Albany Medical College, Albany, New York, USA;

3. Albany Medical College, Albany, New York, USA;

4. Department of Neurological Surgery, Jacobs School of Medicine and Biomedical Sciences at University at Buffalo, New York, New York, USA;

5. University of Pittsburgh School of Medicine, Pittsburgh, Pennsylvania, USA

Abstract

BACKGROUND AND OBJECTIVES: The Internet has become a primary source of health information, leading patients to seek answers online before consulting health care providers. This study aims to evaluate the implementation of Chat Generative Pre-Trained Transformer (ChatGPT) in neurosurgery by assessing the accuracy and helpfulness of artificial intelligence (AI)–generated responses to common postsurgical questions.

METHODS: A list of 60 commonly asked questions regarding neurosurgical procedures was developed. ChatGPT-3.0, ChatGPT-3.5, and ChatGPT-4.0 responses to these questions were recorded and graded by numerous practitioners for accuracy and helpfulness. The understandability and actionability of the answers were assessed using the Patient Education Materials Assessment Tool. Readability analysis was conducted using established scales.

RESULTS: A total of 1080 responses were evaluated, equally divided among ChatGPT-3.0, 3.5, and 4.0, each contributing 360 responses. The mean helpfulness score across the 3 subsections was 3.511 ± 0.647 while the accuracy score was 4.165 ± 0.567. The Patient Education Materials Assessment Tool analysis revealed that the AI-generated responses had higher actionability scores than understandability. This indicates that the answers provided practical guidance and recommendations that patients could apply effectively. On the other hand, the mean Flesch Reading Ease score was 33.5, suggesting that the readability level of the responses was relatively complex. The Raygor Readability Estimate scores ranged within the graduate level, with an average score of the 15th grade.

CONCLUSION: The artificial intelligence chatbot's responses, although factually accurate, were not rated highly beneficial, with only marginal differences in perceived helpfulness and accuracy between ChatGPT-3.0 and ChatGPT-3.5 versions. Despite this, the responses from ChatGPT-4.0 showed a notable improvement in understandability, indicating enhanced readability over earlier versions.
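
For readers unfamiliar with the Flesch Reading Ease score reported in the abstract, the following is a minimal sketch of how such a score can be computed from the standard published formula: 206.835 − 1.015 × (words per sentence) − 84.6 × (syllables per word). It is not the authors' pipeline; the abstract says only that "established scales" were used. The function names and the vowel-group syllable heuristic are assumptions made for illustration, and dedicated readability tools count syllables more carefully.

```python
import re


def count_syllables(word: str) -> int:
    """Approximate syllables as runs of consecutive vowels.

    Assumption: a crude heuristic for illustration, not a dictionary lookup.
    """
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_reading_ease(text: str) -> float:
    """Compute the Flesch Reading Ease score for a block of English text."""
    # Split on sentence-ending punctuation and drop empty fragments.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word
```

Higher scores indicate easier text. On the conventional Flesch scale, scores between 30 and 50 are described as difficult, college-level reading, which is consistent with the abstract's characterization of a mean score of 33.5 as relatively complex.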

Publisher

Ovid Technologies (Wolters Kluwer Health)

References: 21 articles.

1. The application of artificial intelligence in spine surgery;Zhou;Front Surg.,2022

2. Neurosurgery and artificial intelligence;Mofatteh;AIMS Neurosci.,2021

3. Automatic glioma characterization from dynamic susceptibility contrast imaging: brain tumor segmentation using knowledge-based fuzzy clustering;Emblem;J Magn Reson Imaging.,2009

4. Artificial intelligence in the management of intracranial aneurysms: current status and future perspectives;Shi;AJNR Am J Neuroradiol.,2020

5. Automated prediction of the thoracolumbar injury classification and severity score from CT using a novel deep learning algorithm;Doerr;Neurosurg Focus.,2022

Cited by 13 articles.

1. Apple Intelligence in neurosurgery;Neurosurgical Review;2024-07-15

2. In Reply: Usefulness and Accuracy of Artificial Intelligence Chatbot Responses to Patient Questions for Neurosurgical Procedures;Neurosurgery;2024-07-01

3. Letter: Usefulness and Accuracy of Artificial Intelligence Chatbot Responses to Patient Questions for Neurosurgical Procedures;Neurosurgery;2024-07-01

4. ChatGPT Responses to Frequently Asked Questions on Ménière's Disease: A Comparison to Clinical Practice Guideline Answers;OTO Open;2024-07

5. Letter: Usefulness and Accuracy of Artificial Intelligence Chatbot Responses to Patient Questions for Neurosurgical Procedures;Neurosurgery;2024-06-27