Abstract
Detecting COVID-19 risk factors in clinical text is a challenging task. Existing named entity recognition models are usually trained on a limited set of named entities. Beyond clinical factors, non-clinical factors such as social determinants of health (SDoH) are also important in the study of infectious diseases. In this paper, we propose a generalizable machine learning approach that improves on previous efforts by recognizing a large number of clinical risk factors and SDoH. The novelty of the proposed method lies in its combination of several deep neural networks, including a BiLSTM-CNN-CRF model and a transformer-based embedding layer. Experimental results on a cohort of COVID-19 data prepared from PubMed articles show that the proposed approach outperforms the methods it is compared against, with a performance gain of about 1–5% in macro- and micro-average F1 scores. Clinical practitioners and researchers can use this approach to obtain accurate information about clinical risk factors and SDoH, and can use the pipeline as a tool to help end the pandemic or to prepare for future pandemics.
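For readers unfamiliar with the two averaging schemes named in the abstract, the following is a minimal sketch of how macro- and micro-average F1 differ. It is not the paper's evaluation code: it assumes token-level labels (entity-level span matching is also common in named entity recognition), and the label names `DISEASE`, `SDOH`, and `O` as well as the function names are hypothetical.

```python
from collections import Counter


def f1(tp, fp, fn):
    """F1 from raw counts; returns 0.0 when there is nothing to score."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_micro_f1(gold, pred, ignore="O"):
    """Token-level macro and micro F1 over all labels except `ignore`.

    Assumption: `gold` and `pred` are equal-length lists of label strings,
    and `ignore` marks tokens that belong to no entity.
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            if g != ignore:
                tp[g] += 1
        else:
            if p != ignore:
                fp[p] += 1  # predicted an entity label that is wrong
            if g != ignore:
                fn[g] += 1  # missed the gold entity label
    labels = sorted((set(gold) | set(pred)) - {ignore})
    # Macro: average the per-label F1 scores, so every label counts equally.
    macro = sum(f1(tp[l], fp[l], fn[l]) for l in labels) / len(labels) if labels else 0.0
    # Micro: pool the counts first, so frequent labels dominate the score.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    return macro, micro


# Hypothetical toy labels, for illustration only.
gold = ["DISEASE", "DISEASE", "DISEASE", "SDOH", "O", "O"]
pred = ["DISEASE", "DISEASE", "O", "DISEASE", "O", "SDOH"]
macro, micro = macro_micro_f1(gold, pred)
print(f"macro-F1={macro:.3f} micro-F1={micro:.3f}")
```

Because macro averaging weights every entity type equally while micro averaging weights every token equally, a model that handles rare entity types poorly tends to score lower on macro-F1 than on micro-F1, which is one reason both are often reported together.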
Subject
Virology, Infectious Diseases
Cited by
2 articles.