Affiliation:
1. College of Computer Science and Software Engineering, Shenzhen University, Shenzhen 518060, China
2. GeneGenieDx Corp, San Jose, CA 95134, USA
Abstract
Motivation
Natural language processing (NLP) tasks aim to convert unstructured text data (e.g. articles or dialogues) into structured information. Recent years have seen fundamental advances in NLP techniques, which are now widely used in applications such as financial text mining, news recommendation and machine translation. However, applying them in the biomedical space remains challenging because labeled data are scarce and biological terminology is ambiguous and inconsistent. In biomedical marker discovery studies, tools that rely on NLP models to automatically and accurately extract relations between biomedical entities are valuable: they can survey the available literature more thoroughly, and hence give a less biased result, than manual curation. In addition, the speed of a machine reader helps to orient research and development quickly.
Results
To address these needs, we developed automatic training data labeling, rule-based biological terminology cleaning and a more accurate NLP model for binary associative and multi-relation prediction, and integrated them into the MarkerGenie program. We demonstrated the effectiveness of the proposed methods in identifying relations between biomedical entities on various benchmark datasets and case studies.
Availability and implementation
MarkerGenie is available at https://www.genegeniedx.com/markergenie/. Data for model training and evaluation, term lists of biomedical entities, details of the case studies and all trained models are provided at https://drive.google.com/drive/folders/14RypiIfIr3W_K-mNIAx9BNtObHSZoAyn?usp=sharing.
Supplementary information
Supplementary data are available at Bioinformatics Advances online.
Funder
National Key Research and Development Project
National Natural Science Foundation of China
Guangdong Provincial Key Laboratory
Shenzhen Fundamental Research Program
BGI-Shenzhen
Publisher
Oxford University Press (OUP)
Subject
Cell Biology, Developmental Biology, Embryology, Anatomy
Cited by
7 articles.