Clinical concept recognition: Evaluation of existing systems on EHRs

Published: 2023-01-13 Volume: 5
ISSN: 2624-8212
Container-title: Frontiers in Artificial Intelligence
Short-container-title: Front. Artif. Intell.

Author:

Lossio-Ventura Juan Antonio, Sun Ran, Boussard Sebastien, Hernandez-Boussard Tina

Abstract

Objective: The adoption of electronic health records (EHRs) has produced enormous amounts of data, creating research opportunities in clinical data sciences. Several concept recognition systems have been developed to facilitate clinical information extraction from these data. While studies exist that compare the performance of many concept recognition systems, they are typically developed internally and may be biased due to different internal implementations, parameters used, and limited number of systems included in the evaluations. The goal of this research is to evaluate the performance of existing systems to retrieve relevant clinical concepts from EHRs.

Methods: We investigated six concept recognition systems, including CLAMP, cTAKES, MetaMap, NCBO Annotator, QuickUMLS, and ScispaCy. Clinical concepts extracted included procedures, disorders, medications, and anatomical location. The system performance was evaluated on two datasets: the 2010 i2b2 and the MIMIC-III. Additionally, we assessed the performance of these systems in five challenging situations, including negation, severity, abbreviation, ambiguity, and misspelling.

Results: For clinical concept extraction, CLAMP achieved the best performance on exact and inexact matching, with an F-score of 0.70 and 0.94, respectively, on i2b2; and 0.39 and 0.50, respectively, on MIMIC-III. Across the five challenging situations, ScispaCy excelled in extracting abbreviation information (F-score: 0.86) followed by NCBO Annotator (F-score: 0.79). CLAMP outperformed in extracting severity terms (F-score 0.73) followed by NCBO Annotator (F-score: 0.68). CLAMP outperformed other systems in extracting negated concepts (F-score 0.63).

Conclusions: Several concept recognition systems exist to extract clinical information from unstructured data. This study provides an external evaluation by end-users of six commonly used systems across different extraction tasks. Our findings suggest that CLAMP provides the most comprehensive set of annotations for clinical concept extraction tasks and associated challenges. Comparing standard extraction tasks across systems provides guidance to other clinical researchers when selecting a concept recognition system relevant to their clinical information extraction task.
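
The abstract reports F-scores under exact and inexact matching but does not define the two criteria. The sketch below illustrates one common convention, assumed here rather than taken from the paper: an exact match requires identical span boundaries and concept type, while an inexact match only requires overlapping spans of the same type. The `Span` layout and the function names `overlaps` and `f_score` are hypothetical and exist only for illustration.

```python
from typing import List, Tuple

# Hypothetical span layout: (start offset, end offset exclusive, concept type).
Span = Tuple[int, int, str]


def overlaps(pred: Span, gold: Span) -> bool:
    """Inexact criterion (assumed): same concept type and overlapping offsets."""
    return pred[2] == gold[2] and pred[0] < gold[1] and gold[0] < pred[1]


def f_score(gold: List[Span], predicted: List[Span], exact: bool = True) -> float:
    """F1 over spans under the exact or the inexact matching criterion."""
    if exact:
        def match(p: Span, g: Span) -> bool:
            return p == g
    else:
        match = overlaps

    # Precision counts predicted spans that match some gold span;
    # recall counts gold spans that are matched by some predicted span.
    matched_pred = sum(1 for p in predicted if any(match(p, g) for g in gold))
    matched_gold = sum(1 for g in gold if any(match(p, g) for p in predicted))

    precision = matched_pred / len(predicted) if predicted else 0.0
    recall = matched_gold / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy annotations (invented for this example, not drawn from i2b2 or MIMIC-III).
gold = [(0, 12, "disorder"), (20, 29, "medication")]
predicted = [(0, 12, "disorder"), (22, 29, "medication"), (40, 45, "procedure")]

print(round(f_score(gold, predicted, exact=True), 2))   # 0.4
print(round(f_score(gold, predicted, exact=False), 2))  # 0.8
```

In the toy data the second prediction has the right type but a shifted start offset, so it counts only under the inexact criterion. This is the kind of boundary disagreement that can separate the two scores a system receives on the same dataset.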

Funder

National Cancer Institute

Publisher

Frontiers Media SA

Subject

Artificial Intelligence

Cited by 3 articles.

1. Leveraging GPT-4 for identifying cancer phenotypes in electronic health records: a performance comparison between GPT-4, GPT-3.5-turbo, Flan-T5, Llama-3-8B, and spaCy’s rule-based and machine learning-based methods;JAMIA Open;2024-07-01

2. Early Risk Prediction of Depression Based on Social Media Posts in Arabic;2023 IEEE 35th International Conference on Tools with Artificial Intelligence (ICTAI);2023-11-06

3. Leveraging GPT-4 for Identifying Clinical Phenotypes in Electronic Health Records: A Performance Comparison between GPT-4, GPT-3.5-turbo and spaCy’s Rule-based & Machine Learning-based methods;2023-09-29