Abstract
AbstractThe knowledge of protein stability upon residue variation is an important step for functional protein design and for understanding how protein variants can promote disease onset. Computational methods are important to complement experimental approaches and allow a fast screening of large datasets of variations. In this work we present DDGemb, a novel method combining protein language model embeddings and transformer architectures to predict protein ΔΔG upon both single- and multi-point variations. DDGemb has been trained on a high-quality dataset derived from literature and tested on available benchmark datasets of single- and multi-point variations. DDGemb performs at the state of the art in both single- and multi-point variations.
Publisher
Cold Spring Harbor Laboratory