GePI: Retrieval of fully automated recognition and extraction of gene and protein interaction mentions from unstructured literature

Author:

Faessler Erik, Hahn Udo, Schäuble Sascha

Abstract

Motivation: Knowledge about interactions between genes and proteins is vital for bio-molecular research. A large part of this knowledge is published in written text and not accessible in a structured way. To remedy this situation, several repositories of automatically extracted interaction facts were proposed over the years. However, existing solutions lack key features such as permanently updated data resources, easy accessibility and structured result generation ready to be used for downstream analyses.

Results: We propose GePI, a database portal for fully automated extraction and presentation of molecular interaction facts from scientific literature. GePI offers batch queries, immediate inspection of textual evidence and full text filters. To this end, GePI leverages two gene recognition and normalization approaches as well as runtime-optimized molecular event extraction. The resulting natural language processing pipeline is applied to the full set of publicly available documents from PubMed and the PubMed Central open access subset, accounting for more than 33M abstracts and 4.2M complete articles as of 2022. To accommodate the rapid growth of the scientific literature, the fact database is automatically updated several times per week. In summary, our web application GePI allows, for the first time, free and easy-to-use investigation of gene and protein interaction information as soon as it is published, with unique query possibilities.

Availability and Implementation: The GePI web interface is available at http://gepi.coling.uni-jena.de.

Contact: erik.faessler@uni-jena.de

Publisher

Cold Spring Harbor Laboratory

