Abstract
Background: Amino acids are the building blocks of proteins and enzymes, which are pivotal for life on Earth. Amino acid usage provides critical insights into the functional constraints acting on proteins and illuminates the molecular mechanisms underpinning traits. Despite this, we have limited knowledge of the genome-wide signatures of amino acid usage across domains of life, which has prevented new genome and proteome patterns from being discovered.
Results: Here, we analysed the proteomes of 5,590 species across four domains of life and found that only a small subset of amino acids occupies the most and least frequently used ranks across proteomes. This creates a ubiquitous ‘edge effect’ on amino acid usage diversity by rank, which arises from constraints on protein secondary structure. This edge effect was not driven by the evolutionary chronology of amino acids, showing that functional rather than evolutionary constraints shape amino acid usage in the proteome. We also tested contemporary hypotheses about similarities in amino acid usage profiles and about the relationship between amino acid usage and growth temperature. Contrary to previous beliefs, we found that amino acid usage varies across domains of life and that temperature contributes only weakly to variance in amino acid usage.
Conclusion: We have described a novel and ubiquitous amino acid usage signature across genomes, which reveals how structural constraints shape amino acid usage at the proteome level. This can ultimately influence how we probe deep evolutionary relationships of protein families across the tree of life and how we engineer biological systems in synthetic biology.
Publisher
Cold Spring Harbor Laboratory