Assessing Accuracy of ChatGPT on Addressing Helicobacter pylori Infection‐Related Questions: A National Survey and Comparative Study

Authors:

Hu Yi (1,2), Lai Yongkang (1,3,4), Liao Foqiang (1), Shu Xu (1), Zhu Yin (1), Du Yi-Qi (3), Lu Nong-Hua (1)

Affiliations:

1. Department of Gastroenterology, Jiangxi Medical College, The First Affiliated Hospital, Digestive Disease Hospital, Nanchang University, Nanchang, Jiangxi, China

2. Department of Surgery, The Chinese University of Hong Kong, Hong Kong, China

3. Department of Gastroenterology, Changhai Hospital, Naval Medical University, Shanghai, China

4. Department of Gastroenterology, Jiangxi Medical College, Ganzhou People's Hospital, Nanchang University, Nanchang, China

Abstract

Background: ChatGPT is a novel, online large language model increasingly used as a source of up-to-date and useful health-related knowledge for patients and clinicians. However, its performance on Helicobacter pylori infection-related questions remains unknown. This study aimed to evaluate the accuracy of ChatGPT's responses to H. pylori-related questions compared with that of gastroenterologists during the same period.

Methods: Twenty-five H. pylori-related questions covering five domains (Indication, Diagnostics, Treatment, Gastric cancer and prevention, and Gut Microbiota) were selected based on the Maastricht VI consensus report. Each question was posed three times to ChatGPT3.5 and ChatGPT4. Two independent H. pylori experts assessed the responses from ChatGPT, with discrepancies resolved by a third reviewer. In parallel, a nationwide survey using the same questions was conducted among 1279 gastroenterologists and 154 medical students. The accuracy of responses from ChatGPT3.5 and ChatGPT4 was compared with that of the gastroenterologists.

Results: Overall, both ChatGPT3.5 and ChatGPT4 demonstrated high accuracy, with a median accuracy of 92% in each of the three rounds of responses, surpassing that of the nationwide gastroenterologist cohort (median: 80%) and matching that of senior gastroenterologists. Compared with ChatGPT3.5, ChatGPT4 provided more concise responses with the same accuracy. ChatGPT3.5 performed well in the Indication, Treatment, and Gut Microbiota domains, whereas ChatGPT4 excelled in the Diagnostics, Gastric cancer and prevention, and Gut Microbiota domains.

Conclusion: ChatGPT exhibited high accuracy and reproducibility in addressing H. pylori-related questions, except for decisions regarding H. pylori treatment, performing at the level of senior gastroenterologists, and could serve as an auxiliary information tool for assisting patients and clinicians.
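To make the reported accuracy figures concrete, the short Python sketch below shows one way the per-round and median accuracy described in the Methods and Results could be computed; the grading data are hypothetical placeholders, not the study's actual expert ratings.

```python
# Minimal sketch (not the authors' code): per-round accuracy over 25 graded
# responses and the median across three repeated rounds, as in the abstract.
from statistics import median

# Hypothetical expert grades for 25 questions in 3 rounds:
# True = response judged accurate, False = judged inaccurate.
rounds = [
    [True] * 23 + [False] * 2,  # round 1: 23/25 correct -> 92%
    [True] * 23 + [False] * 2,  # round 2: 23/25 correct -> 92%
    [True] * 24 + [False] * 1,  # round 3: 24/25 correct -> 96%
]

per_round_accuracy = [sum(r) / len(r) * 100 for r in rounds]
print("Per-round accuracy (%):", per_round_accuracy)   # [92.0, 92.0, 96.0]
print("Median accuracy (%):", median(per_round_accuracy))  # 92.0
```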

Funder

National Natural Science Foundation of China

Key Research and Development Program of Jiangxi Province

Publisher

Wiley
