All You Need Is Context: Clinician Evaluations of various iterations of a Large Language Model-Based First Aid Decision Support Tool in Ghana

Published: 2024-04-04

Author:

Mensah Paulina Boadiwaa, Quao Nana Serwaa, Dagadu Sesinam, Mensah James Kwabena, Darkwah Jude Domfeh

Abstract

As advancements in research and development expand the capabilities of Large Language Models (LLMs), there is a growing focus on their applications within the healthcare sector, driven by the large volume of data generated in healthcare. A few medicine-oriented evaluation datasets and benchmarks exist for assessing the performance of various LLMs in clinical scenarios; however, there is a paucity of information on the real-world usefulness of LLMs in context-specific scenarios in resource-constrained settings. In this work, five iterations of a decision support tool for medical emergencies were constructed using five distinct generalized LLMs, together with a combination of Prompt Engineering and Retrieval Augmented Generation techniques. Fifty responses were generated from the LLMs. Quantitative and qualitative evaluations of the LLM responses were provided by 13 physicians (general practitioners) with an average of 3 years of practice experience managing medical emergencies in resource-constrained settings in Ghana. Machine evaluations of the LLM responses were also computed and compared with the expert evaluations.

Publisher

Cold Spring Harbor Laboratory

References: 22 articles.

1. Project Genie Clinician Evaluation Group (March, 2024) https://bit.ly/clinician-evaluators-project-genie

2. Zhou, H., Gu, B., Zou, X., Li, Y., Chen, S.S., et al. (2023). A Survey of Large Language Models in Medicine: Progress, Application, and Challenge. ArXiv, abs/2311.05112.

3. SnooCODE Red Team, “CONTEXT MATTERS: DIFFERENCES IN AI FIRST AID ASSISTANT OUTPUTS IN VARIOUS CONTEXTS.” https://bit.ly/snoocodered-context-matters

4. On the cusp: Considering the impact of artificial intelligence language models in healthcare

5. ChatGPT Utility in Healthcare Education, Research, and Practice: Systematic Review on the Promising Perspectives and Valid Concerns

Cited by 1 article.

1. Can Large Language Models Provide Emergency Medical Help Where There Is No Ambulance? A Comparative Study on Large Language Model Understanding of Emergency Medical Scenarios in Resource-Constrained Settings (2024-04-19)