Affiliation:
1. Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
2. Department of Biological Chemistry, University of Michigan, Ann Arbor, MI 48109, USA
Abstract
Motivation
The accuracy and success rate of de novo protein design remain limited, mainly due to parameter over-fitting in current energy functions and their inability to discriminate incorrect designs from correct ones.
Results
We developed EvoEF2, an extended energy function for efficient de novo protein sequence design, based on the previously proposed physical energy function EvoEF. EvoEF2 recovered 32.5%, 47.9% and 22.3% of all, core and surface residues, respectively, for 148 test monomers. It was also generally applicable to protein–protein interaction design: it recapitulated 30.9%, 42.4%, 31.3% and 21.4% of all, core, interface and surface residues, respectively, for 88 test dimers, significantly outperforming EvoEF in native sequence recapitulation. We further used I-TASSER to evaluate the foldability of the 148 designed monomer sequences. All of them were predicted to fold into structures with high fold- and atomic-level similarity to their corresponding native structures; 87.8% of the predicted structures had a root-mean-square deviation of less than 2 Å from their native counterparts. The study also demonstrated that the usefulness of a physical energy function depends strongly on how its parameters are optimized: EvoEF2, with parameters optimized using sequence recapitulation, is more suitable for computational protein sequence design than EvoEF, which was optimized on thermodynamic mutation data.
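The recovery percentages above are fractions of positions at which the designed residue matches the native one, reported overall and per residue class. The following is a minimal sketch of that calculation for a single protein, not the authors' implementation. The function name `sequence_recovery`, the toy sequences and the per-position class labels are all hypothetical; how residues are assigned to core, interface or surface, and how values are averaged across the benchmark proteins, is not stated in this abstract and is left to the caller.

```python
from collections import defaultdict


def sequence_recovery(native, designed, classes=None):
    """Fraction of positions where the designed residue equals the native one.

    native, designed: equal-length one-letter amino acid sequences.
    classes: optional per-position labels (e.g. "core", "surface");
             the labelling scheme is an assumption supplied by the caller.
    Returns a dict with an "all" entry plus one entry per class label.
    """
    if len(native) != len(designed):
        raise ValueError("native and designed sequences must have equal length")
    if classes is not None and len(classes) != len(native):
        raise ValueError("classes must have one label per position")

    matches = defaultdict(int)
    totals = defaultdict(int)
    for i, (nat, des) in enumerate(zip(native, designed)):
        labels = ["all"]
        if classes is not None:
            labels.append(classes[i])
        for label in labels:
            totals[label] += 1
            matches[label] += int(nat == des)
    return {label: matches[label] / totals[label] for label in totals}


# Toy data, invented purely for illustration.
native = "MKTAYIAKQR"
designed = "MKSAYLAKEK"
classes = ["surface"] * 3 + ["core"] * 4 + ["surface"] * 3

recovery = sequence_recovery(native, designed, classes)
for label, value in recovery.items():
    print(f"{label}: {value:.1%}")
```

In this toy case six of ten positions match overall, three of four core positions match, and three of six surface positions match. A benchmark-level figure would additionally require aggregating such values over all test proteins.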
Availability and implementation
The source code of EvoEF2 and the benchmark datasets are freely available at https://zhanglab.ccmb.med.umich.edu/EvoEF.
Supplementary information
Supplementary data are available at Bioinformatics online.
Funder
National Institute of General Medical Sciences
National Institute of Allergy and Infectious Diseases
National Science Foundation
Publisher
Oxford University Press (OUP)
Subject
Computational Mathematics, Computational Theory and Mathematics, Computer Science Applications, Molecular Biology, Biochemistry, Statistics and Probability
Cited by
82 articles.