Abstract
Proteins, the molecular machines that underpin all biological life, are of significant therapeutic and industrial value. Directed evolution is a high-throughput experimental approach for improving protein function, but it has difficulty escaping local maxima in the fitness landscape. Here, we investigate how supervised learning in a closed loop with DNA synthesis and high-throughput screening can be used to improve protein design. Using the green fluorescent protein (GFP) as an illustrative example, we demonstrate the opportunities and challenges of generating training datasets conducive to selecting strongly generalizing models. With prospectively designed wet lab experiments, we then validate that these models can generalize to unseen regions of the fitness landscape, even when constrained to explore combinations of non-trivial mutations. Taken together, these results suggest a hybrid optimization strategy for protein design in which a predictive model is used to explore difficult-to-access but promising regions of the fitness landscape that directed evolution can then exploit at scale.
Publisher
Cold Spring Harbor Laboratory