Abstract
ABSTRACTDeep learning has significantly advanced the development of high-performance methods for protein function prediction. Nonetheless, even for state-of-the-art deep learning approaches, template information remains an indispensable component in most cases. While many function prediction methods use templates identified through sequence homology or protein-protein interactions, very few methods detect templates through structural similarity, even though protein structures are the basis of their functions. Here, we describe our development of StarFunc, a composite approach that integrates state-of-the-art deep learning models seamlessly with template information from sequence homology, protein-protein interaction partners, proteins with similar structures, and protein domain families. Large-scale benchmarking and blind testing in the 5thCritical Assessment of Function Annotation (CAFA5) consistently demonstrate StarFunc’s advantage when compared to both state-of-the-art deep learning methods and conventional template-based predictors.
Publisher
Cold Spring Harbor Laboratory