Protein Sequence Design by Entropy-based Iterative Refinement

Published: 2023-02-04

Author:

Zhou Xinyi, Chen Guangyong, Ye Junjie, Wang Ercheng, Zhang Jun, Mao Cong, Li Zhanwei, Hao Jianye, Huang Xingxu, Tang Jin, Heng Pheng Ann

Abstract

Inverse Protein Folding (IPF) is an important task in protein design: it aims to design sequences compatible with a given backbone structure. Although many algorithms have been developed for this task, existing methods tend to rely on a limited and noisy residue environment when generating sequences. In this paper, we develop an iterative sequence refinement pipeline that refines the sequences generated by existing sequence design models. It selects and retains reliable predictions based on the model's confidence in its predicted distributions, and decodes the remaining residue types based on a partially visible environment. The proposed scheme consistently improves the performance of a number of IPF models on several sequence design benchmarks, and increases the sequence recovery of the state-of-the-art (SOTA) model by up to 10%. Finally, we show that the proposed model can be applied to redesign Transposon-associated transposase B: of the 20 variants we proposed, 8 exhibit improved gene editing activity. Our code and a demo of the refinement pipeline are provided in an online Colab notebook.
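The refinement loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Model` interface, the `keep_fraction` and `n_iters` parameters, the greedy argmax decoding, and `toy_model` are all assumptions made for illustration, and the backbone structure is assumed to be baked into the model callable. Confidence is measured here by the Shannon entropy of each position's predicted distribution, as the title suggests; low entropy is treated as a reliable prediction.

```python
import math
from typing import Callable, List, Optional, Sequence

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Hypothetical model interface (an assumption, not the paper's API):
#   input  - one entry per residue: a letter if visible, None if masked
#   output - one probability distribution over AMINO_ACIDS per residue
Model = Callable[[List[Optional[str]]], List[Sequence[float]]]


def entropy(dist: Sequence[float]) -> float:
    """Shannon entropy (in nats) of one predicted distribution."""
    return -sum(p * math.log(p) for p in dist if p > 0.0)


def decode(dist: Sequence[float]) -> str:
    """Greedy decoding: pick the most probable residue type."""
    best = max(range(len(AMINO_ACIDS)), key=lambda k: dist[k])
    return AMINO_ACIDS[best]


def refine(model: Model, length: int,
           keep_fraction: float = 0.5, n_iters: int = 3) -> str:
    """Iteratively keep low-entropy residues and re-decode the rest."""
    # Initial pass: nothing is visible, so every residue is predicted blind.
    probs = model([None] * length)
    seq = [decode(d) for d in probs]
    n_keep = max(1, int(keep_fraction * length))
    for _ in range(n_iters):
        # Rank positions from most confident (lowest entropy) to least.
        ranked = sorted(range(length), key=lambda i: entropy(probs[i]))
        keep = set(ranked[:n_keep])
        # Expose only the retained residues to the model.
        visible = [seq[i] if i in keep else None for i in range(length)]
        probs = model(visible)
        # Retained residues stay fixed; masked ones are decoded again.
        seq = [seq[i] if i in keep else decode(probs[i])
               for i in range(length)]
    return "".join(seq)


def toy_model(visible: List[Optional[str]]) -> List[Sequence[float]]:
    """Toy stand-in for a real IPF model, for demonstration only."""
    def peaked(letter: str) -> List[float]:
        return [0.81 if aa == letter else 0.01 for aa in AMINO_ACIDS]

    # Nearly uniform distribution with a slight preference for G.
    flat = [0.06 if aa == "G" else 0.04 if aa == "A" else 0.05
            for aa in AMINO_ACIDS]
    any_visible = any(v is not None for v in visible)
    out = []
    for i in range(len(visible)):
        if i % 2 == 0:
            out.append(peaked("A"))   # always confident
        elif any_visible:
            out.append(peaked("L"))   # confident once context is visible
        else:
            out.append(flat)          # uncertain without context
    return out


if __name__ == "__main__":
    print(refine(toy_model, 4, n_iters=0))  # initial prediction only
    print(refine(toy_model, 4))             # after refinement
```

In the toy example, the odd positions are uncertain in the blind first pass and become confident once the low-entropy even positions are made visible, which is the effect the pipeline relies on. A real setup would replace `toy_model` with an existing sequence design model conditioned on the backbone.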

Publisher

Cold Spring Harbor Laboratory

References: 45 articles.

1. Gao W, Mahajan SP, Sulam J, Gray JJ. Deep learning in protein structural modeling and design. Patterns. 2020; p. 100142.

2. De novo design of a four-fold symmetric TIM-barrel protein with atomic-level accuracy; Nature Chemical Biology, 2016

3. Control over overall shape and size in de novo designed proteins

4. Anand-Achim N, Eguchi RR, Mathews II, Perez CP, Derry A, Altman RB, et al. Protein sequence design with a learned potential. bioRxiv. 2021; p. 2020–01.

5. The Rosetta all-atom energy function for macromolecular modeling and design; Journal of Chemical Theory and Computation, 2017

Cited by 4 articles.

1. Protein Manufacture: Protein Design Assisted by Machine Learning from Backbone to Sequence; Lecture Notes in Computer Science; 2024

2. Protein sequence design on given backbones with deep learning; Protein Engineering, Design and Selection; 2023-12-29

3. A new age in protein design empowered by deep learning; Cell Systems; 2023-11

4. Context-aware geometric deep learning for protein sequence design; 2023-06-19