BACKGROUND
The increasing use of social media to share lived and living experiences of substance use presents a unique opportunity to obtain information on side effects, usage patterns, and opinions on novel psychoactive substances (NPS). However, the large volume of data makes it challenging to obtain useful insights through natural language processing (NLP) technologies such as large language models (LLMs).
OBJECTIVE
To develop a retrieval-augmented generation (RAG) architecture for medical question answering that addresses clinicians’ queries on emerging health-related issues using user-generated medical information from social media.
METHODS
We proposed a two-layer RAG framework for query-focused answer generation and evaluated a proof of concept of the framework on query-focused summary generation from social media forums, focusing on emerging drug-related information. We compared the performance of a quantized LLM, deployable in low-resource settings, with that of GPT-4.
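As an illustration of what a two-layer, query-focused pipeline could look like, the following minimal Python sketch may help. The abstract does not specify what each layer does, so the sketch assumes that the first layer produces a query-focused summary of each retrieved post and the second layer combines those summaries into a single answer. The function names (`retrieve`, `two_layer_answer`), the token-overlap retriever, the prompt wording, and the `generate` callable are all hypothetical and are not taken from the study.

```python
import re
from collections import Counter
from typing import Callable, List


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def retrieve(query: str, posts: List[str], k: int = 3) -> List[str]:
    """Return up to k posts ranked by token overlap with the query.

    Assumption: token overlap is a simple stand-in for whatever retriever
    a real system would use (for example, dense embeddings).
    """
    query_counts = Counter(tokenize(query))
    scored = []
    for post in posts:
        overlap = sum((query_counts & Counter(tokenize(post))).values())
        if overlap > 0:
            scored.append((overlap, post))
    # Stable sort: posts with equal overlap keep their original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [post for _, post in scored[:k]]


def two_layer_answer(
    query: str,
    posts: List[str],
    generate: Callable[[str], str],
    k: int = 3,
) -> str:
    """Answer a query using two generation layers (assumed structure).

    `generate` is a hypothetical callable wrapping any LLM, such as a
    locally hosted quantized model or a hosted API.
    """
    retrieved = retrieve(query, posts, k)
    if not retrieved:
        return "No relevant posts retrieved."

    # Layer 1 (assumed): one query-focused summary per retrieved post.
    partial_summaries = [
        generate(
            f"Question: {query}\nPost: {post}\n"
            "Summarize only what is relevant to the question."
        )
        for post in retrieved
    ]

    # Layer 2 (assumed): merge the per-post summaries into one answer.
    joined = "\n".join(f"- {summary}" for summary in partial_summaries)
    return generate(
        f"Question: {query}\nPartial summaries:\n{joined}\n"
        "Combine these into one concise answer."
    )
```

Under these assumptions, comparing two models amounts to calling `two_layer_answer` twice with the same query and posts but a different `generate` callable, one per model.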
RESULTS
Across 20 queries with 52 samples, our framework achieved comparable median scores for relevance, length, hallucination, coverage, and coherence with GPT-4 and with Nous-Hermes-2-7B-DPO.
CONCLUSIONS
Retrieval-augmented generation using LLMs is useful for medical question answering in resource-constrained settings.