Integration of pre-trained protein language models into geometric deep learning networks

Published:2023-08-25 Issue:1 Volume:6
ISSN:2399-3642
Container-title:Communications Biology
language:en
Short-container-title:Commun Biol

Author:

Wu Fang, Wu Lirong, Radev Dragomir, Xu Jinbo, Li Stan Z.

Abstract

Geometric deep learning has recently achieved great success in non-Euclidean domains, and learning on 3D structures of large biomolecules is emerging as a distinct research area. However, its efficacy is largely constrained due to the limited quantity of structural data. Meanwhile, protein language models trained on substantial 1D sequences have shown burgeoning capabilities with scale in a broad range of applications. Several preceding studies consider combining these different protein modalities to promote the representation power of geometric neural networks but fail to present a comprehensive understanding of their benefits. In this work, we integrate the knowledge learned by well-trained protein language models into several state-of-the-art geometric networks and evaluate a variety of protein representation learning benchmarks, including protein-protein interface prediction, model quality assessment, protein-protein rigid-body docking, and binding affinity prediction. Our findings show an overall improvement of 20% over baselines. Strong evidence indicates that the incorporation of protein language models’ knowledge enhances geometric networks’ capacity by a significant margin and can be generalized to complex tasks.

Publisher

Springer Science and Business Media LLC

Subject

General Agricultural and Biological Sciences; General Biochemistry, Genetics and Molecular Biology; Medicine (miscellaneous)

Link

https://www.nature.com/articles/s42003-023-05133-1.pdf

References: 70 articles.

1. Xu, M. et al. Geodiff: a geometric diffusion model for molecular conformation generation. In International Conference on Learning Representations (ICLR, 2022).

2. Townshend, R. J. et al. Atom3d: tasks on molecules in three dimensions. In 35th Conference on Neural Information Processing Systems (NeurIPS, 2021).

3. Wu, Z. et al. Moleculenet: a benchmark for molecular machine learning. Chem. Sci. 9, 513–530 (2018).

4. Lim, J. et al. Predicting drug–target interaction using a novel graph neural network with 3d structure-embedded graph representation. J. Chem. Inf. Model. 59, 3981–3988 (2019).

5. Liu, Y., Yuan, H., Cai, L. & Ji, S. Deep learning of high-order interactions for protein interface prediction. In Proceedings of the 26th ACM SIGKDD international conference on knowledge discovery & data mining, 679–687 (ACM, 2020).

Cited by 15 articles.

1. Evaluating the 3D structure prediction tools to identify optimal MEBPVC structure models;Computational and Structural Biotechnology Reports;2024-12

2. Unveiling the evolution of policies for enhancing protein structure predictions: A comprehensive analysis;Computers in Biology and Medicine;2024-09

3. Machine-learning-based structural analysis of interactions between antibodies and antigens;BioSystems;2024-09

4. Understanding and Therapeutic Application of Immune Response in Major Histocompatibility Complex (MHC) Diversity Using Multimodal Artificial Intelligence;BioMedInformatics;2024-08-05

5. Pairing interacting protein sequences using masked language modeling;Proceedings of the National Academy of Sciences;2024-06-24