Abstract
In eukaryotic mRNAs, upstream open reading frames (uORFs) in the 5′ untranslated regions (5′ UTRs) often attenuate the translation of downstream main ORFs (mORFs). Although some uORFs play beneficial regulatory roles, uORFs are generally disfavored in evolution. Here we studied how uORF repression is suppressed in Arabidopsis. We found that the heterogeneous distribution of transcription start sites (TSSs) results in heterogeneous 5′ UTRs that selectively exclude uORFs from mRNAs. Thus, only a subset of the transcripts from “uORF-containing” genes truly contain uORFs. Importantly, the fraction of uORFs remaining within transcripts determines overall uORF repressiveness. Interestingly, uORFs that encode conserved peptides are almost exclusively preserved within mRNAs, implying coevolution between TSSs and functional uORFs. Consistent with TSSs determining uORF presence, we observed a sharp transition in AUG frequency between promoters and 5′ UTRs, and this pattern distinguishes genes lacking translated uORFs from genes carrying them. Remarkably, while 55% of the genes are predicted to contain uORFs, only 9% of the transcripts within the mRNA pool genuinely contain uORFs once the heterogeneous TSSs are accounted for. Our results highlight a profound effect of TSS distribution on uORF repressiveness, a factor that was previously overlooked. As a cautionary note, TSS heterogeneity should be taken into consideration when studying various 5′ UTR features, such as RNA structures and protein binding motifs, in post-transcriptional gene regulation. The uORFs and other features preferentially preserved in 5′ UTRs (i.e., downstream of TSSs) are more likely to be functional as a result of natural selection.
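The idea that TSS usage sets the share of transcripts retaining a uORF can be illustrated with a small calculation. The following is a minimal sketch, not the authors' pipeline: the function name `uorf_retention_fraction`, the plus-strand coordinates, and the read counts are all hypothetical, and it assumes that a transcript contains the uORF whenever its TSS lies at or upstream of the uORF start codon.

```python
from typing import Dict


def uorf_retention_fraction(
    tss_counts: Dict[int, int], uorf_start: int, morf_start: int
) -> float:
    """Fraction of a gene's transcripts whose 5' UTR includes the uORF AUG.

    tss_counts maps a TSS coordinate (plus strand, increasing 5' to 3')
    to the number of reads supporting that TSS (hypothetical input format).
    """
    if uorf_start >= morf_start:
        raise ValueError("the uORF start must lie upstream of the mORF start")

    # Only TSSs upstream of the mORF start produce a 5' UTR at all.
    total = sum(n for pos, n in tss_counts.items() if pos < morf_start)
    if total == 0:
        return 0.0

    # Simplifying assumption: a transcript retains the uORF if it starts
    # at or upstream of the uORF AUG.
    retained = sum(n for pos, n in tss_counts.items() if pos <= uorf_start)
    return retained / total


# Hypothetical gene: three TSS clusters, uORF AUG at 160, mORF AUG at 300.
example_tss = {100: 10, 150: 30, 220: 60}
print(uorf_retention_fraction(example_tss, uorf_start=160, morf_start=300))  # 0.4
```

In this made-up example the gene would be annotated as "uORF-containing", yet only 40% of its transcripts carry the uORF, because the dominant TSS at position 220 lies downstream of the uORF start codon.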
Publisher
Cold Spring Harbor Laboratory