Affiliation:
1. University of Alabama at Birmingham
Abstract
Understanding the intricacies of gene function within biological systems is paramount for scientific advancement and medical progress. However, the evolving landscape of this research and the complexity of biological processes make this task challenging. We introduce PATHAK, a natural language processing (NLP)-based method that mines relationships between genes and their functions from published scientific articles. PATHAK uses a pre-trained Transformer language model to generate sentence embeddings from a large dataset of scientific documents. This enables the identification of meaningful associations between genes and their potential functional annotations. Our approach is adaptable and applicable across diverse scientific domains. Applying PATHAK to over 5,000 research articles focused on Arabidopsis thaliana, we demonstrate its efficacy in elucidating gene-function relationships. This method promises to significantly advance our understanding of gene functionality and potentially accelerate discoveries in development, growth, and stress responses in plants and other systems.
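The general idea of scoring gene-function associations by embedding similarity can be sketched as follows. This is a minimal illustration, not the PATHAK implementation: the abstract does not specify the model, the scoring rule, or the data format, so everything here is an assumption. In particular, `embed` is a bag-of-words stand-in for a pre-trained Transformer sentence encoder, `rank_functions` and its max-similarity rule are hypothetical, and the gene names and sentences are made-up placeholders rather than findings.

```python
import math
import re
from collections import Counter


def embed(sentence):
    """Stand-in for a Transformer sentence encoder (assumption):
    returns a sparse bag-of-words count vector instead of a dense embedding."""
    return Counter(re.findall(r"[a-z0-9]+", sentence.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_functions(sentences, gene, function_terms):
    """Hypothetical scoring rule: for each candidate function term, take the
    highest similarity between the term and any sentence mentioning the gene."""
    gene_sentences = [s for s in sentences if gene.lower() in s.lower()]
    scores = {}
    for term in function_terms:
        term_vec = embed(term)
        scores[term] = max(
            (cosine(embed(s), term_vec) for s in gene_sentences),
            default=0.0,
        )
    # Highest-scoring candidate annotation first.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Placeholder corpus: invented sentences with invented gene names.
sentences = [
    "GENE_A expression increases during drought stress.",
    "GENE_B is required for root hair development.",
]
function_terms = ["drought stress response", "root development"]

ranked = rank_functions(sentences, "GENE_A", function_terms)
```

In a realistic setting, `embed` would be replaced by a pre-trained sentence encoder and the cosine similarity computed on dense vectors; the ranking logic would otherwise stay the same under these assumptions.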
Publisher
Research Square Platform LLC
Cited by
2 articles.