Abstract
Background
The use of generative artificial intelligence, specifically large language models (LLMs), is proliferating, so it is vital to consider both the value and the potential harms of their use in medical education. Their efficiency across a variety of writing styles makes LLMs, such as ChatGPT, attractive for tailoring educational materials. However, this technology can exhibit biases and produce misinformation, which can be particularly harmful in medical education settings, such as mental health and substance use education. This viewpoint investigates whether ChatGPT is sufficient for 2 common health education functions in the field of mental health and substance use: (1) answering users’ direct queries and (2) aiding in the development of quality consumer educational health materials.
Objective
This viewpoint includes a case study to provide insight into the accessibility, biases, and quality of ChatGPT’s query responses and educational health materials. We aim to provide guidance for the general public and health educators wishing to utilize LLMs.
Methods
We collected real-world queries from 2 large-scale mental health and substance use portals and engineered a variety of prompts to use on GPT-4 Pro with the Bing BETA internet browsing plug-in. The outputs were evaluated on 3 dimensions: accessibility, using tools from the Sydney Health Literacy Lab; bias, based on adherence to Mindframe communication guidelines; and quality, based on author assessments of tailoring to audiences, duty of care disclaimers, and evidence-based internet references.
Results
GPT-4’s outputs had good face validity, but upon detailed analysis were substandard in comparison to expert-developed materials. Without engineered prompting, the reading level, adherence to communication guidelines, and use of evidence-based websites were poor. Therefore, all outputs still required cautious human editing and oversight.
Conclusions
GPT-4 is currently not reliable enough for direct-consumer queries, but educators and researchers can use it for creating educational materials with caution. Materials created with LLMs should disclose the use of generative artificial intelligence and be evaluated on their efficacy with the target audience.