Abstract
In this paper, a tagging tool is developed to streamline the process of locating tags for each term and manually selecting the target term. It directly extracts the terms to be tagged from sentences and displays it to the user. It also increases tagging efficiency by allowing users to reflect candidate categories in untagged terms. It is based on annotations automatically generated using machine learning. Subsequently, this architecture is fine-tuned using Bidirectional Encoder Representations from Transformers (BERT) to enable the tagging of terms that cannot be captured using Named-Entity Recognition (NER). The tagged text data extracted using the proposed tagging tool can be used as an additional training dataset. The tagging tool, which receives and saves new NE annotation input online, is added to the NER and RE web interfaces using BERT. Annotation information downloaded by the user includes the category (e.g., diseases, genes/proteins) and the list of words associated to the named entity selected by the user. The results reveal that the RE and NER results are improved using the proposed web service by collecting more NE annotation data and fine-tuning the model using generated datasets. Our application programming interfaces and demonstrations are available to the public at via the website link provided in this paper.
Funder
National Research Foundation of Korea
Subject
Fluid Flow and Transfer Processes,Computer Science Applications,Process Chemistry and Technology,General Engineering,Instrumentation,General Materials Science
Reference36 articles.
1. Ayush, S., Simmons, M., and Lu, Z. (2016). Text mining genotype-phenotype relationships from biomedical literature for database curation and precision medicine. PLoS Comput. Biol., 12.
2. On expert curation and scalability: UniProtKB/Swiss-Prot as a case study;Bioinformatics,2017
3. Text-mining-assisted biocuration workflows in Argo;Database,2014
4. Assisting manual literature curation for protein–protein interactions using BioQRator;Database,2014
5. Egas: A collaborative and interactive document curation platform;Database,2014
Cited by
2 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献
1. A lightweight biomedical named entity recognition with pre-trained model;2023 IEEE 3rd International Conference on Data Science and Computer Application (ICDSCA);2023-10-27
2. BERT Fine-Tuning the Covid-19 Open Research Dataset for Named Entity Recognition;Communications in Computer and Information Science;2023