Abstract
Transcription activation domains (ADs) are encoded by a wide range of seemingly unrelated amino acid sequences, making it difficult to recognize features that permit their dynamic behavior, fuzzy interactions and target specificity. We screened a large set of random 30-mer peptides for AD function and trained a deep neural network (ADpred) on the AD-positive and AD-negative sequences. ADpred correctly identifies known ADs within protein sequences and accurately predicts the consequences of mutations. We show that functional ADs (1) are located within intrinsically disordered regions with biased amino acid composition, (2) contain clusters of hydrophobic residues near acidic side chains, (3) are enriched or depleted for particular dipeptide sequences, and (4) have higher helical propensity than surrounding regions. Taken together, our findings fit the model of “fuzzy” binding through hydrophobic protein-protein interfaces, where activator-coactivator binding takes place in a dynamic hydrophobic environment rather than through combinations of sequence-specific interactions.
Publisher
Cold Spring Harbor Laboratory