Affiliation:
1. Department of Instrumentation Engineering, Vishwakarma Institute of Technology, Pune, India
Abstract
Over the last two decades, electronic health records (EHRs) have evolved into a crucial repository for patient health data, encompassing both structured and unstructured information. The objective of EHRs is to enhance patient care, and also to serve as a tool for reducing costs, managing population health, and supporting clinical research. Natural language processing (NLP) has emerged as a valuable tool for analyzing narrative EHR data, particularly in named entity recognition (NER) tasks. However, traditional NLP methodologies struggle to analyze biomedical text due to variations in word distributions. Recent advancements in NLP, specifically bidirectional encoder representations from transformers (BERT), offer promising solutions. BERT uses a masked language modeling objective and a bidirectional transformer encoder architecture to learn deep contextual representations of words. This work provides an overview of the BERT algorithm, its architecture, and its variants, such as BioBERT and ClinicalBERT, for various clinical text classification applications.