Abstract
While many computational methods accurately predict destabilizing mutations, identifying stabilizing mutations has remained a challenge because of their relative rarity. We tested ΔΔG° predictions from computational predictors such as Rosetta, ThermoMPNN, RaSP, and DeepDDG, using eighty-two mutants of the bacterial toxin CcdB as a test case. On this dataset, the best computational predictor is ThermoMPNN, which identifies stabilizing mutations with a precision of 68%. However, the average increase in Tm for these predicted mutations was only 1°C for CcdB, and predictions were poorer for a more challenging target, influenza neuraminidase. Using data from multiple previously described yeast surface display libraries and in vitro thermal stability measurements, we trained logistic regression models to identify stabilizing mutations with a precision of 90% and an average increase in Tm of 3°C for CcdB. When such libraries contain a population of mutants with significantly enhanced binding relative to the corresponding wild type, there is no benefit in using computational predictors. It is then possible to predict stabilizing mutations without any training, simply by examining the distribution of mutational binding scores. This avoids the laborious steps of in vitro expression, purification, and stability characterization. When no such population is present, combining data from computational predictors with high-throughput experimental binding data enhances the prediction of stabilizing mutations. However, this requires training on stability data measured in vitro with known stabilized mutants. It is thus feasible to predict stabilizing mutations rapidly and accurately for any system of interest that can be subjected to a binding selection or screen.
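To make the precision figures and the training-free approach concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' method. The abstract does not give the exact criterion for "significantly enhanced binding", so the margin-above-wild-type rule, the function names, the mutant labels, and all numbers below are hypothetical.

```python
from typing import Dict, Set


def call_stabilizing(binding_scores: Dict[str, float],
                     wt_score: float,
                     margin: float = 0.0) -> Set[str]:
    """Flag mutants whose binding score exceeds the wild-type score by more than `margin`.

    Assumption: a simple cutoff stands in for "examining the distribution
    of mutational binding scores"; the real criterion may differ.
    """
    return {m for m, s in binding_scores.items() if s > wt_score + margin}


def precision(predicted: Set[str], truly_stabilizing: Set[str]) -> float:
    """Precision = TP / (TP + FP): the fraction of predicted stabilizing
    mutants that are actually stabilizing."""
    if not predicted:
        return float("nan")  # undefined when nothing is predicted
    return len(predicted & truly_stabilizing) / len(predicted)


# Toy, made-up values for illustration only (not data from the study).
toy_scores = {"mutA": 1.4, "mutB": 0.9, "mutC": 1.2, "mutD": 1.1, "mutE": 0.5}
# Hypothetical set of mutants whose measured Tm exceeds wild type.
toy_truth = {"mutA", "mutC"}

calls = call_stabilizing(toy_scores, wt_score=1.0, margin=0.05)
print(sorted(calls), precision(calls, toy_truth))
```

In this toy example three mutants pass the cutoff and two of them are in the hypothetical stabilized set, so the precision is 2/3. Raising `margin` trades fewer calls for, typically, fewer false positives.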
Publisher
Cold Spring Harbor Laboratory