Affiliation:
1. IEETA/DETI, LASI, University of Aveiro, Campus Universitário de Santiago, Aveiro 3810-193, Portugal
Abstract
Biomedical relation extraction is an ongoing challenge within the natural language processing community. It is important for understanding the scientific biomedical literature and has many use cases, such as drug discovery, precision medicine, disease diagnosis, treatment optimization and biomedical knowledge graph construction. A tool that addresses this task effectively could therefore improve knowledge discovery by automating the extraction of relations from research manuscripts. The first track of the BioCreative VIII competition extended the scope of this challenge by introducing the detection of novel relations within the literature. This paper describes our participating system, which initially focused on jointly extracting and classifying novel relations between biomedical entities. We then describe its subsequent advancement to an end-to-end model. Specifically, we enhanced our initial system by incorporating it into a cascading pipeline that includes tagger and linker modules. This integration enables the extraction of relations and the classification of their novelty directly from raw text. Our tagger module attained state-of-the-art named entity recognition performance, with a micro F1-score of 90.24, while our end-to-end system achieved a competitive novelty F1-score of 24.59. The code to run our system is publicly available at https://github.com/ieeta-pt/BioNExt.
Database URL: https://github.com/ieeta-pt/BioNExt
Publisher
Oxford University Press (OUP)