Highly accurate protein structure prediction for the human proteome

Published:2021-07-22 Issue:7873 Volume:596 Page:590-596
ISSN:0028-0836
Container-title:Nature
language:en
Short-container-title:Nature

Author:

Tunyasuvunakool Kathryn, Adler Jonas, Wu Zachary, Green Tim, Zielinski Michal, Žídek Augustin, Bridgland Alex, Cowie Andrew, Meyer Clemens, Laydon Agata, Velankar Sameer, Kleywegt Gerard J., Bateman Alex, Evans Richard, Pritzel Alexander, Figurnov Michael, Ronneberger Olaf, Bates Russ, Kohl Simon A. A., Potapenko Anna, Ballard Andrew J., Romera-Paredes Bernardino, Nikolov Stanislav, Jain Rishub, Clancy Ellen, Reiman David, Petersen Stig, Senior Andrew W., Kavukcuoglu Koray, Birney Ewan, Kohli Pushmeet, Jumper John, Hassabis Demis

Abstract

Protein structures can provide invaluable information, both for reasoning about biological processes and for enabling interventions such as structure-based drug development or targeted mutagenesis. After decades of effort, 17% of the total residues in human protein sequences are covered by an experimentally determined structure [1]. Here we markedly expand the structural coverage of the proteome by applying the state-of-the-art machine learning method, AlphaFold2, at a scale that covers almost the entire human proteome (98.5% of human proteins). The resulting dataset covers 58% of residues with a confident prediction, of which a subset (36% of all residues) have very high confidence. We introduce several metrics developed by building on the AlphaFold model and use them to interpret the dataset, identifying strong multi-domain predictions as well as regions that are likely to be disordered. Finally, we provide some case studies to illustrate how high-quality predictions could be used to generate biological hypotheses. We are making our predictions freely available to the community and anticipate that routine large-scale and high-accuracy structure prediction will become an important tool that will allow new questions to be addressed from a structural perspective.

Publisher

Springer Science and Business Media LLC

Subject

Multidisciplinary

Link

https://www.nature.com/articles/s41586-021-03828-1.pdf

References: 75 articles.

1. SWISS-MODEL. Homo sapiens (human). https://swissmodel.expasy.org/repository/species/9606 (2021).

2. Jumper, J. et al. Highly accurate protein structure prediction with AlphaFold. Nature https://doi.org/10.1038/s41586-021-03819-2 (2021).

3. International Human Genome Sequencing Consortium. Initial sequencing and analysis of the human genome. Nature 409, 860–921 (2001).

4. Venter, J. C. et al. The sequence of the human genome. Science 291, 1304–1351 (2001).

5. wwPDB Consortium. Protein Data Bank: the single global archive for 3D macromolecular structure data. Nucleic Acids Res. 47, D520–D528 (2018).

Cited by 2065 articles.

1. Insights into the functional mechanism of the non-specific lipid transfer protein nsLTP in Kalanchoe fedtschenkoi (Lavender scallops);Protein Expression and Purification;2025-02

2. SpatialPPI: Three-dimensional space protein-protein interaction prediction with AlphaFold Multimer;Computational and Structural Biotechnology Journal;2024-12

3. Differential prolyl hydroxylation by six Physcomitrella prolyl-4 hydroxylases;Computational and Structural Biotechnology Journal;2024-12

4. Multi-omics study on the mixed culture of Trichoderma reesei and Aspergillus niger with improved lignocellulase production;Biomass and Bioenergy;2024-11

5. Efficient cytosolic delivery of proteins enabled by modular fusion protein;Chemical Engineering Journal;2024-10