The overview of the BioRED (Biomedical Relation Extraction Dataset) track at BioCreative VIII

Authors:

Islamaj Rezarta (1), Lai Po-Ting (1), Wei Chih-Hsuan (1), Luo Ling (2), Almeida Tiago (3), Jonker Richard A A (3), Conceição Sofia I R (4), Sousa Diana F (4), Phan Cong-Phuoc (5), Chiang Jung-Hsien (5), Li Jiru (2), Pan Dinghao (2), Meesawad Wilailack (6), Tsai Richard Tzong-Han (6, 7), Sarol M Janina (8), Hong Gibong (8), Valiev Airat (9), Tutubalina Elena (9, 10), Lee Shao-Man (11), Hsu Yi-Yu (11), Li Mingjie (12), Verspoor Karin (12), Lu Zhiyong (1)

Affiliations:

1. National Center for Biotechnology Information (NCBI), National Library of Medicine (NLM), National Institutes of Health (NIH), 8600 Rockville Pike, Bethesda, MD 20894, United States

2. School of Computer Science and Technology, Dalian University of Technology, No. 2 Linggong Road, Ganjingzi District, Dalian 116024, China

3. Department of Electronics, Telecommunications and Informatics (DETI), Institute of Electronics and Informatics Engineering of Aveiro (IEETA), University of Aveiro, Campus Universitário de Santiago, Aveiro 3810-193, Portugal

4. Departamento de Informática, Faculdade de Ciências da Universidade de Lisboa, Edifício C6 Campo Grande, Lisbon 1749-016, Portugal

5. Department of Computer Science and Information Engineering, National Cheng Kung University, No. 1, University Road, Tainan City 701, Taiwan, Republic of China

6. Department of Computer Science and Information Engineering, National Central University, No. 300, Zhongda Rd., Zhongli District, Taoyuan City 32001, Taiwan, Republic of China

7. Research Center for Humanities and Social Sciences, Academia Sinica, No. 128, Section 2, Academia Rd., Nangang District, Taoyuan City 115201, Taiwan, Republic of China

8. School of Information Sciences, University of Illinois at Urbana-Champaign, 614 E. Daniel St, Champaign, IL 61820, United States

9. Higher School of Economics University, 20 Myasnitskaya st, Moscow 101000, Russia

10. Kazan Federal University, 18 Kremlevskaya st, Kazan 420008, Russia

11. Miin Wu School of Computing, National Cheng Kung University, No. 1, University Road, Tainan 701, Taiwan, Republic of China

12. School of Computing Technologies, RMIT University, 124 La Trobe Street, Melbourne, Victoria 3000, Australia

Abstract

The BioRED track at BioCreative VIII calls for a community effort to identify relationships between biomedical entities in unstructured text, categorize them semantically, and flag whether each one is novel. Relation extraction is crucial for many biomedical natural language processing (NLP) applications, from drug discovery to custom medical solutions. The BioRED track simulates a real-world application of biomedical relation extraction: it covers multiple biomedical entity types, each normalized to its corresponding database identifier, and defines the relationships between those entities at the document level. The challenge consisted of two subtasks: (i) in Subtask 1, participants were given the article text together with entities annotated by human experts, and were asked to extract the relation pairs and identify their semantic type and novelty factor; (ii) in Subtask 2, participants were given only the article text, and were asked to build an end-to-end system that could identify and categorize the relationships and their novelty. We received a total of 94 submissions from 14 teams worldwide. The highest F-scores achieved for Subtask 1 were 77.17% for relation pair identification, 58.95% for relation type identification, 59.22% for novelty identification, and 44.55% for comprehensive relation extraction, which evaluates all of these aspects together. The highest F-scores achieved for Subtask 2 were 55.84% for relation pair, 43.03% for relation type, 42.74% for novelty, and 32.75% for comprehensive relation extraction. The entire BioRED track dataset and other challenge materials are available at https://ftp.ncbi.nlm.nih.gov/pub/lu/BC8-BioRED-track/, https://codalab.lisn.upsaclay.fr/competitions/13377, and https://codalab.lisn.upsaclay.fr/competitions/13378.
Database URL: https://ftp.ncbi.nlm.nih.gov/pub/lu/BC8-BioRED-track/; https://codalab.lisn.upsaclay.fr/competitions/13377; https://codalab.lisn.upsaclay.fr/competitions/13378
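
The four scores reported for each subtask correspond to increasingly strict matching criteria. The following is a minimal Python sketch of how such F-scores could be computed. It is not the track's official evaluation script. The `Relation` record, the `project` and `f_score` helpers, the treatment of entity pairs as unordered, and all identifiers and labels in the toy data are assumptions made for illustration.

```python
from typing import Iterable, NamedTuple


class Relation(NamedTuple):
    """Hypothetical record for one document-level relation."""
    entity_a: str   # normalized database identifier (illustrative)
    entity_b: str   # normalized database identifier (illustrative)
    rel_type: str   # semantic relation type
    novelty: str    # novelty label


def project(rel: Relation, level: str) -> tuple:
    """Keep only the fields that must match at the given evaluation level."""
    # Assumption: the order of the two entities does not matter.
    pair = frozenset((rel.entity_a, rel.entity_b))
    if level == "pair":
        return (pair,)
    if level == "type":
        return (pair, rel.rel_type)
    if level == "novelty":
        return (pair, rel.novelty)
    if level == "all":
        return (pair, rel.rel_type, rel.novelty)
    raise ValueError(f"unknown evaluation level: {level}")


def f_score(gold: Iterable[Relation], pred: Iterable[Relation], level: str) -> float:
    """Harmonic mean of precision and recall over projected relations."""
    gold_set = {project(r, level) for r in gold}
    pred_set = {project(r, level) for r in pred}
    true_positives = len(gold_set & pred_set)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(pred_set)
    recall = true_positives / len(gold_set)
    return 2 * precision * recall / (precision + recall)


# Toy data: both pairs are found, but one relation type is wrong.
gold = [
    Relation("ID_A", "ID_B", "Association", "Novel"),
    Relation("ID_C", "ID_A", "Negative_Correlation", "No"),
]
pred = [
    Relation("ID_B", "ID_A", "Positive_Correlation", "Novel"),
    Relation("ID_C", "ID_A", "Negative_Correlation", "No"),
]

for level in ("pair", "type", "novelty", "all"):
    print(level, f_score(gold, pred, level))
```

In this toy example the pair and novelty levels score 1.0, while the type and comprehensive levels score 0.5, because one predicted relation has the right pair and novelty but the wrong type. This mirrors the pattern in the reported results, where the comprehensive score is the lowest of the four.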

Funder

Research Unit

Russian Science Foundation

Kazan Federal University

LASIGE Computer Science and Engineering Research Centre

Fundamental Research Funds for the Central Universities

FCT

Fundação para a Ciência e a Tecnologia

NIH Intramural Research Program, National Library of Medicine

Publisher

Oxford University Press (OUP)
