Abstract
Many intrinsic functional characteristics of cells and tissues shape their genome-wide expression patterns. However, which other factors also modulate these patterns is not fully known. Here, we revisit the general model of costs in which the protein products of highly expressed genes should be short and made up of biosynthetically cheap amino acids. We first use single-cell expression data from a collection of human cell types to confirm this model with a twist: the most highly expressed proteins tend to be particularly short and use expensive amino acids. By clustering how these two factors change with expression across all cell types, we identified a set of archetypal profiles that uniquely balance costs and that occur in different proportions across cell types. Similar profiles were also found by examining the expression data of tissues, which allowed us to recognize those following a more or less costly strategy. We then asked how this model might delineate the expression changes seen in a tumor relative to its normal solid tissue, as it has been argued that energy constraints determine cancer progression. We discovered a strong signal for the overexpression of biosynthetically cheap, compact genes in cancer tissues. Our work highlights how both aspects of the metabolic cost of a protein, length and amino acid biosynthesis, represent valuable measures for understanding the different levels of biological organization and also the differences between health and disease.
Publisher
Cold Spring Harbor Laboratory