Abstract
Objective: To develop a transformer-based natural language processing (NLP) system for detecting adverse drug events (ADEs) from clinical notes in electronic health records (EHRs).

Materials and Methods: We fine-tuned BERT Short-Formers and Clinical-Longformer using the processed dataset of the 2018 National NLP Clinical Challenges (n2c2) shared task Track 2. We investigated two data processing methods, the window-based and split-based approaches, to find an optimal processing method. We evaluated generalization on a dataset extracted from Vanderbilt University Medical Center (VUMC) EHRs.

Results: On the n2c2 dataset, the best average macro F-scores of 0.832 and 0.868 were achieved using a 15-word window with PubMedBERT and a 10-chunk split with Clinical-Longformer, respectively. On the VUMC dataset, the best average macro F-scores of 0.720 and 0.786 were achieved using a 4-chunk split with PubMedBERT and Clinical-Longformer, respectively.

Discussion: Our study provides a comparative analysis of data processing methods. The fine-tuned transformer models showed good performance on ADE-related tasks. In particular, the Clinical-Longformer model with the split-based approach showed strong potential for practical implementation of ADE detection. While the token limit was crucial, the chunk size also significantly influenced model performance, even when the text length was within the token limit.

Conclusion: We provide guidance on model development, including data processing methods, for ADE detection from clinical notes using transformer-based models. Our results on two datasets indicate that data processing methods and models should be carefully selected based on the type of clinical notes and the trade-offs in allocating human and computational power to annotation and model fine-tuning.
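The abstract names the window-based and split-based approaches but does not define them. The sketch below shows one plausible reading, offered only as an illustration: a window keeps a fixed number of words on each side of a target word, and a split divides a note into a fixed number of contiguous chunks of near-equal length. The function names `window_around` and `split_into_chunks`, the example sentence, and the choice to centre the window on a drug mention are all assumptions, not the authors' implementation.

```python
from typing import List


def window_around(words: List[str], center: int, window: int) -> List[str]:
    """Return words[center] plus up to `window` words on each side (assumed definition)."""
    start = max(0, center - window)
    end = min(len(words), center + window + 1)
    return words[start:end]


def split_into_chunks(words: List[str], n_chunks: int) -> List[List[str]]:
    """Split words into n_chunks contiguous pieces of near-equal length (assumed definition)."""
    if n_chunks < 1:
        raise ValueError("n_chunks must be at least 1")
    base, extra = divmod(len(words), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        # The first `extra` chunks take one additional word each.
        size = base + (1 if i < extra else 0)
        chunks.append(words[start:start + size])
        start += size
    return chunks


# Hypothetical example sentence, not taken from either dataset.
note = "Patient developed a rash after starting amoxicillin and the drug was stopped".split()
drug_idx = note.index("amoxicillin")

print(window_around(note, drug_idx, 2))
print(split_into_chunks(note, 4))
```

Under this reading, a window always yields a short input centred on one mention, whereas the length of each chunk in a split depends on the length of the note, which is one way chunk size could matter even when the full text fits within the token limit.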
Publisher
Cold Spring Harbor Laboratory