Author:
Si Dong, Moritz Spencer A., Pfab Jonas, Hou Jie, Cao Renzhi, Wang Liguo, Wu Tianqi, Cheng Jianlin
Abstract
Cryo-electron microscopy (cryo-EM) has become a leading technology for determining protein structures. Recent advances in this field have allowed for atomic resolution. However, predicting the backbone trace of a protein has remained a challenge on all but the most pristine density maps (<2.5 Å resolution). Here we introduce a deep learning model that uses a set of cascaded convolutional neural networks (CNNs) to predict Cα atoms along a protein’s backbone structure. The cascaded-CNN (C-CNN) is a novel deep learning architecture composed of multiple CNNs, each predicting a specific aspect of a protein’s structure. This model predicts secondary structure elements (SSEs), backbone structure, and Cα atoms, combining the results of each to produce a complete prediction map. The cascaded-CNN is a semantic segmentation image classifier and was trained using thousands of simulated density maps. This method is largely automatic and only requires a recommended threshold value for each protein density map. A specialized tabu-search path-walking algorithm was used to produce an initial backbone trace with Cα placements. A helix-refinement algorithm made further improvements to the α-helix SSEs of the backbone trace. Finally, a novel quality assessment-based combinatorial algorithm was used to map protein sequences onto Cα traces to obtain full-atom protein structures. This method was tested on 50 experimental maps between 2.6 Å and 4.4 Å resolution. It outperformed several state-of-the-art prediction methods, including Rosetta de-novo, MAINMAST, and a Phenix-based method, by producing the most complete predicted protein structures, as measured by the percentage of found Cα atoms. This method accurately predicted 88.9% (mean) of the Cα atoms within 3 Å of a protein’s backbone structure, surpassing the 66.8% achieved by the leading alternative method (the Phenix-based fully automatic method) on the same set of density maps. The C-CNN also achieved an average root-mean-square deviation (RMSD) of 1.24 Å on a set of 50 experimental density maps that had also been tested with the Phenix-based fully automatic method. The source code and a demo of this research have been published at https://github.com/DrDongSi/Ca-Backbone-Prediction.
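The two evaluation measures quoted above (percentage of Cα atoms found within 3 Å, and RMSD) can be illustrated with a minimal sketch. The abstract does not specify how predicted and native atoms are paired, so the nearest-neighbour matching below is an assumption, and the function names `nearest_distance` and `ca_metrics` as well as the toy coordinates are hypothetical and not taken from the published code.

```python
import math


def nearest_distance(atom, candidates):
    """Euclidean distance (in Å) from `atom` to its closest point in `candidates`."""
    return min(math.dist(atom, c) for c in candidates)


def ca_metrics(native, predicted, cutoff=3.0):
    """Return (percent_found, rmsd) for two lists of (x, y, z) coordinates in Å.

    Assumption: a native Cα counts as "found" when its nearest predicted Cα
    lies within `cutoff` Å. The RMSD is taken over those matched pairs only.
    """
    if not native or not predicted:
        raise ValueError("both coordinate lists must be non-empty")

    # Distance from every native atom to its nearest predicted atom,
    # keeping only the ones that fall inside the cutoff.
    matched = [
        d
        for d in (nearest_distance(atom, predicted) for atom in native)
        if d <= cutoff
    ]

    percent_found = 100.0 * len(matched) / len(native)
    if matched:
        rmsd = math.sqrt(sum(d * d for d in matched) / len(matched))
    else:
        rmsd = float("nan")  # no matched pairs, so RMSD is undefined
    return percent_found, rmsd


# Toy data (made up for illustration): four native Cα atoms spaced 3.8 Å apart
# and three predicted atoms, each offset by 1 Å from a native position.
native = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
predicted = [(0.0, 1.0, 0.0), (3.8, 0.0, 1.0), (7.6, 1.0, 0.0)]

percent, rmsd = ca_metrics(native, predicted)
print(f"found {percent:.1f}% of Cα atoms, RMSD {rmsd:.2f} Å")
```

With these toy coordinates, three of the four native atoms have a predicted atom within 3 Å, and each matched pair is exactly 1 Å apart. A stricter evaluation might enforce one-to-one matching between predicted and native atoms; this sketch does not.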
Publisher
Springer Science and Business Media LLC