Using ChatGPT to Improve Patient Educational Materials: a False Promise or a Promising Tool? (Preprint)

Authors:

Karleigh Maze, Burkely P. Smith, Kristapher Gulley, Shelby C. Hodges, Abiha Abdulla, Oviya Giri, Bethany Brock, Nikita Wadhwani, Bayley Jones, Lauren Wood, Mohanraj Thirumalai, Michael Rubyan, Daniel I. Chu

Abstract

BACKGROUND

Patients’ inability to understand educational materials may explain nonadherence and poor outcomes among those with low health literacy. ChatGPT and similar artificial intelligence (AI) tools may serve as accessible means of improving the readability of these documents and reducing this gap.

OBJECTIVE

The aim of this study was to investigate ChatGPT’s ability to improve readability of patient education materials.

METHODS

This was an explanatory case study using three prompts in ChatGPT (version 4.0) to generate revised text: “Please reproduce this text at a 6th grade reading level” (Prompt A), “Please summarize and simplify the text” (Prompt B), and “Make this document more health literate” (Prompt C). Flesch-Kincaid Grade Level (FKGL), Flesch-Kincaid Reading Ease (FKRE), and SMOG scores were calculated via an online calculator for both original and ChatGPT-revised documents. Unpaired t tests compared mean scores between original and ChatGPT-revised documents.
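The three readability metrics above follow standard published formulas. As an illustration only, a minimal sketch of how they are computed is given below; it uses a naive vowel-group syllable counter, unlike the dictionary-based online calculator used in the study, so its scores will differ slightly from the reported ones.

```python
import math
import re


def count_syllables(word: str) -> int:
    """Crude syllable estimate: count vowel groups, dropping most silent
    trailing 'e's. An approximation, not a dictionary lookup."""
    word = word.lower()
    n = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith(("le", "ee")) and n > 1:
        n -= 1
    return max(1, n)


def readability(text: str) -> dict:
    """Compute FKGL, FKRE, and SMOG using the standard published formulas."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    polysyllables = sum(1 for w in words if count_syllables(w) >= 3)
    n_sent, n_words = len(sentences), len(words)

    # Flesch-Kincaid Grade Level: higher means harder to read.
    fkgl = 0.39 * n_words / n_sent + 11.8 * syllables / n_words - 15.59
    # Flesch Reading Ease: higher means easier to read.
    fkre = 206.835 - 1.015 * n_words / n_sent - 84.6 * syllables / n_words
    # SMOG grade, based on words of three or more syllables.
    smog = 1.0430 * math.sqrt(polysyllables * 30 / n_sent) + 3.1291

    return {"FKGL": round(fkgl, 1), "FKRE": round(fkre, 1), "SMOG": round(smog, 1)}
```

Scoring a document before and after ChatGPT revision with a function like this, then comparing the paired score distributions, mirrors the study's analysis.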

RESULTS

Education materials were gathered from a large tertiary-referral academic institution (N=63) and five rural hospitals (N=90; Demopolis, Greenville, Selma, Montgomery, and Shelby). Documents from the academic institution included ostomy (n=15), pre-operative (n=20), and post-operative (n=28) materials. Compared to the original documents, ChatGPT-revised documents generated with Prompts B and C scored worse on FKGL, FKRE, and SMOG (p<0.01). Documents generated with Prompt A had a higher FKGL (p<0.01) but no change in FKRE or SMOG scores. Documents from the rural hospitals included education related to colonoscopy (n=15), diabetes care (n=18), cardiovascular health (n=12), general health (n=10), surgical care (n=29), and other screening and prevention (n=6). Compared to the original documents, ChatGPT-revised documents generated with Prompt A had improved FKRE (p<0.01) and SMOG (p=0.01) scores, Prompt B was associated with a worse FKGL (p<0.01), and Prompt C was associated with worse FKGL (p<0.01) and FKRE (p<0.01) scores.

CONCLUSIONS

ChatGPT does not consistently improve the readability of patient educational materials. Further work is needed to optimize text-based AIs for this purpose.

Publisher

JMIR Publications Inc.
