Computational Scoring and Experimental Evaluation of Enzymes Generated by Neural Networks

Published: 2023-03-04

Author:

Johnson Sean R., Fu Xiaozhi, Viknander Sandra, Goldin Clara, Monaco Sarah, Zelezniak Aleksej, Yang Kevin K.

Abstract

In recent years, generative protein sequence models have been developed to sample novel sequences. However, predicting whether generated proteins will fold and function remains challenging. We evaluate computational metrics to assess the quality of enzyme sequences produced by three contrasting generative models: ancestral sequence reconstruction, a generative adversarial network, and a protein language model. Focusing on two enzyme families, we expressed and purified over 440 natural and generated sequences with 70-90% identity to the most similar natural sequences to benchmark computational metrics for predicting in vitro enzyme activity. Over three rounds of experiments, we developed a computational filter that improved experimental success rates by 44-100%. Surprisingly, neither sequence identity to natural sequences nor AlphaFold2 residue-confidence scores were predictive of enzyme activity. The proposed metrics and models will drive protein engineering research by serving as a benchmark for generative protein sequence models and helping to select active variants to test experimentally.

Publisher

Cold Spring Harbor Laboratory

References (64 articles)

1. Engineering the third wave of biocatalysis

2. Recombinant protein expression in Escherichia coli: advances and challenges

3. Directed Evolution: Bringing New Chemistry to Life

4. Protein Design by Directed Evolution

5. Natural Selection and the Concept of a Protein Space

Cited by 20 articles.

1. Addressing epistasis in the design of protein function;Proceedings of the National Academy of Sciences;2024-08-12

2. Application of Directed Evolution and Machine Learning to Enhance the Diastereoselectivity of Ketoreductase for Dihydrotetrabenazine Synthesis;JACS Au;2024-06-26

3. Conditional language models enable the efficient design of proficient enzymes;2024-05-05

4. Embracing data science in catalysis research;Nature Catalysis;2024-04-23

5. Combining Rosetta Sequence Design with Protein Language Model Predictions Using Evolutionary Scale Modeling (ESM) as Restraint;ACS Synthetic Biology;2024-04-03