Assessing the readability, reliability, and quality of artificial intelligence chatbot responses to the 100 most searched queries about cardiopulmonary resuscitation: An observational study

Authors:

Ömür Arça Dilek1, Erdemir İsmail1, Kara Fevzi1, Shermatov Nurgazy1, Odacioğlu Mürüvvet1, İbişoğlu Emel1, Hanci Ferid Baran2, Sağiroğlu Gönül1, Hanci Volkan1

Affiliations:

1. Department of Anesthesiology and Reanimation, School of Medicine, Dokuz Eylul University, Izmir, Turkey

2. Department of Artificial Intelligence Engineering, Faculty of Engineering, Ostim Technical University, Ankara, Turkey.

Abstract

This study aimed to evaluate the readability, reliability, and quality of responses by 4 selected artificial intelligence (AI)-based large language model (LLM) chatbots to questions related to cardiopulmonary resuscitation (CPR). This was a cross-sectional study. Responses of 4 selected chatbots (ChatGPT-3.5 [OpenAI], Google Bard [Google AI], Google Gemini [Google AI], and Perplexity [Perplexity AI]) to the 100 most frequently asked questions about CPR were analyzed for readability, reliability, and quality. Each chatbot was first asked, in English: “What are the 100 most frequently asked questions about cardiopulmonary resuscitation?” Each of the 100 queries derived from the responses was then posed individually to the 4 chatbots. The 400 chatbot responses, treated as patient education materials (PEMs), were assessed for quality and reliability using the modified DISCERN Questionnaire, the Journal of the American Medical Association (JAMA) benchmark criteria, and the Global Quality Score. Readability was assessed with 2 different calculators, which independently computed scores using the Flesch Reading Ease Score, Flesch-Kincaid Grade Level, Simple Measure of Gobbledygook, Gunning Fog Index, and Automated Readability Index. We analyzed 100 responses from each of the 4 chatbots. When the median readability scores obtained from Calculators 1 and 2 were compared against the 6th-grade reading level, there was a highly significant difference between the groups (P < .001). By all formulas, the readability of the responses was above the 6th-grade level. The order of readability, from easiest to most difficult, was Bard, Perplexity, Gemini, and ChatGPT-3.5. The text content provided by all 4 chatbots was thus written above the 6th-grade level. We believe that enhancing the quality, reliability, and readability of PEMs will make them easier for readers to understand and support more accurate performance of CPR; in turn, patients who receive bystander CPR may have an increased likelihood of survival.
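
The five readability formulas named in the abstract are standard and publicly documented. The sketch below (Python; not the calculators used in the study, and with a crude vowel-group syllable counter standing in for a dictionary-based one) shows how each score is derived from sentence, word, syllable, and character counts:

```python
import re

def counts(text):
    """Rough token counts for nonempty English text: sentences, words,
    characters, syllables, and 'complex' words (3+ syllables).
    The vowel-group syllable heuristic is an approximation."""
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z']+", text)
    chars = sum(len(w) for w in words)
    syllables = [max(1, len(re.findall(r"[aeiouy]+", w.lower()))) for w in words]
    complex_words = sum(1 for s in syllables if s >= 3)
    return sentences, len(words), chars, sum(syllables), complex_words

def readability(text):
    s, w, c, syl, cw = counts(text)
    return {
        # Flesch Reading Ease Score: higher = easier (60-70 ~ plain English)
        "FRES": 206.835 - 1.015 * (w / s) - 84.6 * (syl / w),
        # Flesch-Kincaid Grade Level: maps to a US school grade
        "FKGL": 0.39 * (w / s) + 11.8 * (syl / w) - 15.59,
        # Simple Measure of Gobbledygook: based on polysyllabic word count
        "SMOG": 1.0430 * (cw * 30 / s) ** 0.5 + 3.1291,
        # Gunning Fog Index: sentence length plus complex-word density
        "GFI": 0.4 * ((w / s) + 100 * cw / w),
        # Automated Readability Index: character-based, no syllable counting
        "ARI": 4.71 * (c / w) + 0.5 * (w / s) - 21.43,
    }

if __name__ == "__main__":
    sample = "Push hard and fast in the center of the chest. Call for help."
    for name, score in readability(sample).items():
        print(f"{name}: {score:.2f}")
```

A grade-level score above 6 on FKGL, SMOG, GFI, or ARI indicates text harder than the 6th-grade threshold commonly recommended for patient education materials, which is the comparison reported in the abstract.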

Publisher

Ovid Technologies (Wolters Kluwer Health)
