Various criteria in the evaluation of biomedical named entity recognition

Published:2006-02-24 Issue:1 Volume:7
ISSN:1471-2105
Container-title:BMC Bioinformatics
language:en
Short-container-title:BMC Bioinformatics

Author:

Tsai Richard Tzong-Han,Wu Shih-Hung,Chou Wen-Chi,Lin Yu-Chun,He Ding,Hsiang Jieh,Sung Ting-Yi,Hsu Wen-Lian

Abstract

Background: Text mining in the biomedical domain is receiving increasing attention. A key component of this process is named entity recognition (NER). Two annotated corpora, GENIA and GENETAG, are most frequently used for training and testing biomedical named entity recognition (Bio-NER) systems. JNLPBA and BioCreAtIvE are two major Bio-NER tasks using these corpora. The two tasks take different approaches to corpus annotation and use different matching criteria to evaluate system performance. This paper details these differences and describes alternative criteria. We then examine the impact of different criteria and annotation schemes on system performance by retesting systems that participated in the above two tasks.

Results: To analyze the difference between JNLPBA's and BioCreAtIvE's evaluation, we conduct Experiment 1 to evaluate the top four JNLPBA systems using BioCreAtIvE's classification scheme. We then compare them with the top four BioCreAtIvE systems. Among them, three systems participated in both tasks, and each achieves a lower F-score on JNLPBA than on BioCreAtIvE. In Experiment 2, we apply hypothesis testing and correlation coefficients to find alternatives to BioCreAtIvE's evaluation scheme. The results show that the right-match and left-match criteria do not differ significantly from BioCreAtIvE's. In Experiment 3, we propose a customized relaxed-match criterion that uses right match and merges JNLPBA's five NE classes into two, which achieves an F-score of 81.5%. In Experiment 4, we evaluate a range of five matching criteria, from loose to strict, on the top JNLPBA system and examine the percentage of false negatives. Our experiment gives the relative change in precision, recall and F-score as matching criteria are relaxed.

Conclusion: In many applications, biomedical NEs can have several acceptable tags, which might differ only in their left or right boundaries. However, most corpora annotate only one of them. In our experiment, we found that right match and left match can be appropriate alternatives to JNLPBA's and BioCreAtIvE's matching criteria. In addition, our relaxed-match criterion demonstrates that users can define their own relaxed criteria that correspond more realistically to their application requirements.
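The way a matching criterion changes precision, recall and F-score can be illustrated with a minimal Python sketch. It is not the paper's evaluation code. The span representation `(start, end, label)`, the function names, the greedy one-to-one matching, and the reading of "left match" and "right match" as agreement on the left or right boundary plus the class label are all assumptions made for illustration, and the toy spans are invented.

```python
from typing import Callable, List, Tuple

# Hypothetical span representation: (start, end, label), with end exclusive.
Span = Tuple[int, int, str]


def exact_match(pred: Span, gold: Span) -> bool:
    # Both boundaries and the class label must agree.
    return pred == gold


def left_match(pred: Span, gold: Span) -> bool:
    # Assumed definition: left boundary and class label must agree.
    return pred[0] == gold[0] and pred[2] == gold[2]


def right_match(pred: Span, gold: Span) -> bool:
    # Assumed definition: right boundary and class label must agree.
    return pred[1] == gold[1] and pred[2] == gold[2]


def evaluate(
    pred: List[Span],
    gold: List[Span],
    match: Callable[[Span, Span], bool],
) -> Tuple[float, float, float]:
    """Return (precision, recall, F-score) under the given criterion."""
    unused = list(gold)  # each gold span may be matched at most once
    true_positives = 0
    for p in pred:
        for g in unused:
            if match(p, g):
                unused.remove(g)
                true_positives += 1
                break
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f_score = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_score


# Invented toy data: the first prediction misses the left boundary only.
gold_spans = [(0, 3, "protein"), (5, 7, "DNA")]
pred_spans = [(1, 3, "protein"), (5, 7, "DNA"), (9, 10, "protein")]

for name, criterion in [
    ("exact", exact_match),
    ("left", left_match),
    ("right", right_match),
]:
    p, r, f = evaluate(pred_spans, gold_spans, criterion)
    print(f"{name}: P={p:.2f} R={r:.2f} F={f:.2f}")
```

On this toy data, the prediction with a wrong left boundary counts as an error under exact and left match but as a hit under right match, so the F-score rises as the criterion is relaxed. A class-merging step, such as the five-to-two mapping mentioned in the abstract, could be applied to the labels before calling `evaluate`; the mapping itself is not given here and would have to be taken from the paper.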

Publisher

Springer Science and Business Media LLC

Subject

Applied Mathematics,Computer Science Applications,Molecular Biology,Biochemistry,Structural Biology

Link

https://link.springer.com/content/pdf/10.1186/1471-2105-7-92.pdf

Cited by 69 articles.

1. The impact of ChatGPT on human skills: A quantitative study on twitter data;Technological Forecasting and Social Change;2024-06

2. Future applications of generative large language models: A data-driven case study on ChatGPT;Technovation;2024-05

3. Evaluating Medical Entity Recognition in Healthcare: A Comprehensive Analysis of BERT-Based Models (Preprint);2024-04-23

4. An entity-centric approach to manage court judgments based on Natural Language Processing;Computer Law & Security Review;2024-04

5. Accelerating materials language processing with large language models;Communications Materials;2024-02-15