Abstract
Several recently introduced approaches use neural networks as probabilistic models for protein sequence design. These models differ in their objective functions and optimization schemes, and each choice comes with trade-offs that are not always well explained. We introduce probabilistic definitions of protein stability and conformational specificity and show how these chemical properties relate to the p(structure | seq) objective used in recent protein design algorithms. This links probabilistic objective functions to experimentally testable outcomes. We present a new sequence decoding algorithm, termed “BayesDesign”, that uses Bayes’ Rule to maximize the p(structure | seq) objective. We evaluate BayesDesign in the context of two protein model systems, the NanoLuc enzyme and the WW structural motif.
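
As a reading aid, the step from Bayes’ Rule to a sequence-level objective can be sketched as follows. This is only the generic identity, not a statement of the exact procedure used by BayesDesign. The abstract does not say which models estimate each term, so reading p(seq | structure) as a structure-conditioned sequence model and p(seq) as a structure-independent sequence prior is an assumption made here for illustration.

```latex
% Bayes' Rule applied to the design objective
p(\mathrm{structure} \mid \mathrm{seq})
  = \frac{p(\mathrm{seq} \mid \mathrm{structure})\, p(\mathrm{structure})}{p(\mathrm{seq})}

% For a fixed target structure, p(structure) does not depend on seq,
% so it drops out of the maximization over sequences:
\arg\max_{\mathrm{seq}} \; p(\mathrm{structure} \mid \mathrm{seq})
  = \arg\max_{\mathrm{seq}} \;
    \bigl[ \log p(\mathrm{seq} \mid \mathrm{structure}) - \log p(\mathrm{seq}) \bigr]
```

Under this reading, maximizing p(structure | seq) favors sequences that are likely given the target structure but not simply likely in general, which is consistent with the abstract's emphasis on conformational specificity.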
Publisher
Cold Spring Harbor Laboratory