Evaluating the Utility of a Large Language Model in Answering Common Patients’ Gastrointestinal Health-Related Questions: Are We There Yet?-Reference-Cited by-同舟云学术

Evaluating the Utility of a Large Language Model in Answering Common Patients’ Gastrointestinal Health-Related Questions: Are We There Yet?

Published:2023-06-02 Issue:11 Volume:13 Page:1950
ISSN:2075-4418
Container-title:Diagnostics
language:en
Short-container-title:Diagnostics

Author:

Lahat Adi¹^ORCID,Shachar Eyal¹,Avidan Benjamin¹,Glicksberg Benjamin²^ORCID,Klang Eyal³

Affiliation:

1. Chaim Sheba Medical Center, Department of Gastroenterology, Affiliated to Tel Aviv University, Tel Aviv 69978, Israel

2. Mount Sinai Clinical Intelligence Center, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA

3. The Sami Sagol AI Hub, ARC Innovation Center, Chaim Sheba Medical Center, Affiliated to Tel-Aviv University, Tel Aviv 69978, Israel

Abstract

Background and aims: Patients frequently have concerns about their disease and find it challenging to obtain accurate Information. OpenAI’s ChatGPT chatbot (ChatGPT) is a new large language model developed to provide answers to a wide range of questions in various fields. Our aim is to evaluate the performance of ChatGPT in answering patients’ questions regarding gastrointestinal health. Methods: To evaluate the performance of ChatGPT in answering patients’ questions, we used a representative sample of 110 real-life questions. The answers provided by ChatGPT were rated in consensus by three experienced gastroenterologists. The accuracy, clarity, and efficacy of the answers provided by ChatGPT were assessed. Results: ChatGPT was able to provide accurate and clear answers to patients’ questions in some cases, but not in others. For questions about treatments, the average accuracy, clarity, and efficacy scores (1 to 5) were 3.9 ± 0.8, 3.9 ± 0.9, and 3.3 ± 0.9, respectively. For symptoms questions, the average accuracy, clarity, and efficacy scores were 3.4 ± 0.8, 3.7 ± 0.7, and 3.2 ± 0.7, respectively. For diagnostic test questions, the average accuracy, clarity, and efficacy scores were 3.7 ± 1.7, 3.7 ± 1.8, and 3.5 ± 1.7, respectively. Conclusions: While ChatGPT has potential as a source of information, further development is needed. The quality of information is contingent upon the quality of the online information provided. These findings may be useful for healthcare providers and patients alike in understanding the capabilities and limitations of ChatGPT.

Publisher

MDPI AG

Subject

Clinical Biochemistry

Link

https://www.mdpi.com/2075-4418/13/11/1950/pdf

Reference22 articles.

1. The management of common gastrointestinal disorders in general practice: A survey by the European Society for Primary Care Gastroenterology (ESPCG) in six European countries;Seifert;Dig. Liver Dis.,2008

2. Abdominal symptoms in general practice: Frequency, cancer suspicions raised, and actions taken by GPs in six European countries. Cohort study with prospective registration of cancer;Holtedahl;Heliyon,2017

3. (2023, March 01). Available online: https://openai.com/blog/chatgpt/.

4. Medical Specialty Recommendations by an Artificial Intelligence Chatbot on a Smartphone: Development and Deployment;Lee;J. Med. Internet Res.,2021

5. Survey of conversational agents in health;Montenegro;Expert Syst. Appl.,2019

Cited by 19 articles. 订阅此论文施引文献订阅此论文施引文献，注册后可以免费订阅5篇论文的施引文献，订阅后可以查看论文全部施引文献

1. Beyond the Scalpel: Assessing ChatGPT's potential as an auxiliary intelligent virtual assistant in oral surgery;Computational and Structural Biotechnology Journal;2024-12

2. Doctor Versus Artificial Intelligence: Patient and Physician Evaluation of Large Language Model Responses to Rheumatology Patient Questions in a Cross‐Sectional Study;Arthritis & Rheumatology;2024-01-18

3. What Does ChatGPT Know About Dementia? A Comparative Analysis of Information Quality;Journal of Alzheimer's Disease;2024-01-16

4. Assessment of ChatGPT for Nutrition Advice: A comparative analysis across multiple languages (Preprint);2024-01-14

5. A Systematic Review and Meta-Analysis of Artificial Intelligence Tools in Medicine and Healthcare: Applications, Considerations, Limitations, Motivation and Challenges;Diagnostics;2024-01-04