De novo protein design by deep network hallucination

Published: 2020-07-23

Author:

Anishchenko Ivan, Chidyausiku Tamuka M., Ovchinnikov Sergey, Pellock Samuel J., Baker David

Abstract

There has been considerable recent progress in protein structure prediction using deep neural networks to infer distance constraints from amino acid residue co-evolution [1–3]. We investigated whether the information captured by such networks is sufficiently rich to generate new folded proteins with sequences unrelated to those of the naturally occurring proteins used in training the models. We generated random amino acid sequences and input them into the trRosetta structure prediction network to predict starting distance maps, which, as expected, are quite featureless. We then carried out Monte Carlo sampling in amino acid sequence space, optimizing the contrast (KL-divergence) between the distance distributions predicted by the network and the background distribution. Optimization from different random starting points resulted in a wide range of proteins with diverse sequences and all-alpha, all-beta-sheet, and mixed alpha-beta structures. We obtained synthetic genes encoding 129 of these network-hallucinated sequences, expressed and purified the proteins in E. coli, and found that 27 folded to monomeric stable structures with circular dichroism spectra consistent with the hallucinated structures. Thus, deep networks trained to predict native protein structures from their sequences can be inverted to design new proteins, and such networks and methods should contribute, alongside traditional physically based models, to the de novo design of proteins with new functions.
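
The sampling loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: `toy_predictor` is a hypothetical stand-in for the trRosetta network, the background distribution is assumed to be uniform over distance bins, and the bin count, acceptance rule (Metropolis), and temperature are illustrative choices rather than values taken from the paper.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
N_BINS = 8  # illustrative number of distance bins (assumption)


def toy_predictor(seq):
    """Hypothetical stand-in for a structure-prediction network.

    Returns, for every residue pair (i, j) with i < j, a probability
    distribution over N_BINS distance bins.
    """
    dists = {}
    for i in range(len(seq)):
        for j in range(i + 1, len(seq)):
            a = AMINO_ACIDS.index(seq[i])
            b = AMINO_ACIDS.index(seq[j])
            strength = ((a * 7 + b * 13) % 5) / 2.0  # arbitrary toy coupling
            peak = (j - i) % N_BINS
            logits = [strength if k == peak else 0.0 for k in range(N_BINS)]
            z = sum(math.exp(x) for x in logits)
            dists[(i, j)] = [math.exp(x) / z for x in logits]
    return dists


def mean_kl(pred, background):
    """Average KL(pred || background) over all residue pairs."""
    total = 0.0
    for pair, p in pred.items():
        q = background[pair]
        total += sum(pk * math.log(pk / qk) for pk, qk in zip(p, q) if pk > 0)
    return total / len(pred)


def hallucinate(length, steps, temperature, seed=0, predict=toy_predictor):
    """Metropolis Monte Carlo in sequence space, maximizing mean KL.

    Returns (best_sequence, starting_score, best_score).
    """
    rng = random.Random(seed)
    seq = [rng.choice(AMINO_ACIDS) for _ in range(length)]
    # Assumed background: uniform over bins for every residue pair.
    background = {pair: [1.0 / N_BINS] * N_BINS for pair in predict(seq)}
    score = mean_kl(predict(seq), background)
    start_score = score
    best_seq, best_score = list(seq), score
    for _ in range(steps):
        pos = rng.randrange(length)
        old = seq[pos]
        seq[pos] = rng.choice(AMINO_ACIDS)  # propose a single-site mutation
        new_score = mean_kl(predict(seq), background)
        accept = new_score >= score or rng.random() < math.exp(
            (new_score - score) / temperature
        )
        if accept:
            score = new_score
            if score > best_score:
                best_seq, best_score = list(seq), score
        else:
            seq[pos] = old  # reject: revert the mutation
    return "".join(best_seq), start_score, best_score
```

Calling `hallucinate(length=12, steps=200, temperature=0.05, seed=1)` returns the best sequence found together with the starting and best contrast scores. With a real network in place of `toy_predictor`, a higher score would correspond to sharper, less background-like predicted distance maps.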

Publisher

Cold Spring Harbor Laboratory

References: 23 articles.

1. Distance-based protein folding powered by deep learning

2. Improved protein structure prediction using potentials from deep learning; Nature, 2020

3. Improved protein structure prediction using predicted interresidue orientations

4. Low-N protein engineering with data-efficient deep learning

5. ProGen: Language Modeling for Protein Generation

Cited by 38 articles.

1. The Potential of Purinergic Signaling to Thwart Viruses Including SARS-CoV-2; Frontiers in Immunology; 2022-06-17

2. Hallucinating structure-conditioned antibody libraries for target-specific binders; 2022-06-06

3. A consensus view on the folding mechanism of protein G, L and their mutants; 2022-04-08

4. AlphaFold encodes the principles to identify high affinity peptide binders; 2022-03-19

5. Large-scale design and refinement of stable proteins using sequence-only models; PLOS ONE; 2022-03-14