Affiliation:
1. Department of Information Science, University of Arkansas at Little Rock, Little Rock, AR 72204, USA
2. Nonclinical Drug Safety, Boehringer Ingelheim Pharmaceuticals, Inc., Ridgefield, CT 06877, USA
3. FDA/National Center for Toxicological Research, Jefferson, AR 72079, USA
Abstract
Causality assessment is vital in patient safety and pharmacovigilance (PSPV) for safety signal detection, adverse reaction management, and regulatory submission. Large language models (LLMs), especially those designed with transformer architecture, are revolutionizing various fields, including PSPV. While attempts to utilize Bidirectional Encoder Representations from Transformers (BERT)-like LLMs for causal inference in PSPV are underway, a detailed evaluation of "fit-for-purpose" BERT-like model selection to enhance causal inference performance within PSPV applications remains absent. This study conducts an in-depth exploration of BERT-like LLMs, including generic pre-trained BERT LLMs, domain-specific pre-trained LLMs, and domain-specific pre-trained LLMs with safety knowledge-specific fine-tuning, for causal inference in PSPV. Our investigation centers on (1) the influence of data complexity and model architecture, (2) the correlation between BERT model size and its impact, and (3) the role of domain-specific training and fine-tuning on three publicly accessible PSPV data sets. The findings suggest that (1) BERT-like LLMs deliver consistent predictive power across varied data complexity levels, (2) the predictive performance and causal inference results do not directly correspond to BERT-like model size, and (3) domain-specific pre-trained LLMs, with or without safety knowledge-specific fine-tuning, surpass generic pre-trained BERT models in causal inference. These findings are valuable for guiding the future application of LLMs across a broad range of applications.
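To illustrate the kind of workflow the abstract describes, the sketch below shows one gradient step of fine-tuning a domain-specific BERT-like encoder for binary causality classification using the Hugging Face Transformers library. This is not the authors' code: the checkpoint name (dmis-lab/biobert-base-cased-v1.1), the label set, the example narrative, and all hyperparameters are illustrative assumptions.

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Assumed domain-specific checkpoint; the study's actual models may differ.
MODEL_NAME = "dmis-lab/biobert-base-cased-v1.1"
LABELS = ["not_causal", "causal"]  # hypothetical binary causality labels

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(
    MODEL_NAME, num_labels=len(LABELS)
)

# Illustrative adverse-event narrative paired with a suspect drug.
text = "Patient developed hepatotoxicity two weeks after starting the suspect drug."
inputs = tokenizer(text, truncation=True, return_tensors="pt")

# One step of safety-knowledge-specific fine-tuning via the built-in
# sequence-classification head (cross-entropy loss).
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
labels = torch.tensor([1])  # 1 = "causal" in this hypothetical label scheme
outputs = model(**inputs, labels=labels)
outputs.loss.backward()
optimizer.step()

# Inference: predicted causality label for the narrative.
with torch.no_grad():
    pred = model(**inputs).logits.argmax(dim=-1).item()
print(LABELS[pred])

In practice, each of the three PSPV data sets would be split for training and evaluation, and the same fine-tuning loop would be repeated across the generic and domain-specific checkpoints to compare predictive performance.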
Subject
General Biochemistry, Genetics and Molecular Biology
Cited by
1 article.