Abstract
The Gene Ontology (GO) is one of the most successful ontologies in the biological domain. GO is a formal theory with over 100,000 axioms that describe the molecular functions, biological processes, and cellular locations of proteins in three sub-ontologies. Many methods have been developed to automatically predict protein functions. However, only a few of them use the background knowledge provided in the axioms of GO for knowledge-enhanced machine learning, or adjust and evaluate the model for the differences between the sub-ontologies. We have developed DeepGO-SE, a novel method that predicts GO functions from protein sequences using a pretrained large language model combined with a neuro-symbolic model that exploits GO axioms and performs protein function prediction as a form of approximate semantic entailment. We specifically evaluate DeepGO-SE on proteins that have no significant similarity with training proteins and demonstrate that DeepGO-SE can improve function prediction for those proteins.
Publisher
Cold Spring Harbor Laboratory
Cited by
1 article.