Abstract
Eukaryotic mRNAs contain a 5′ leader sequence preceding the main open reading frame (mORF) and, depending on the species, 20%–50% of eukaryotic mRNAs harbor an upstream ORF (uORF) in the 5′ leader. An unknown fraction of these uORFs encode sequence-conserved peptides (conserved peptide uORFs, CPuORFs). Experimentally validated CPuORFs that regulate translation of the downstream mORF often do so in a metabolite concentration-dependent manner. Previous research has shown that most CPuORFs possess a start codon context that is suboptimal for translation initiation, which turns out to be favorable for translational regulation. The suboptimal initiation context may even include non-AUG start codons, which makes CPuORFs hard to predict. For this reason, we developed a novel pipeline to identify CPuORFs independently of start codon identity, using well-annotated sequence data from 31 eudicot plant species and rice. The pipeline identified 29 novel Arabidopsis thaliana (Arabidopsis) CPuORFs conserved across a wide variety of eudicot species, 15 of which do not initiate with an AUG start codon. In addition to CPuORFs, the pipeline found 14 conserved coding regions directly upstream of and in frame with the mORF, which likely initiate translation at a non-AUG start codon. Altogether, our pipeline identified highly conserved coding regions in the 5′ leaders of Arabidopsis transcripts, including in genes with proven functional importance such as LHY, a key regulator of the circadian clock, and the RAPTOR1 subunit of the target of rapamycin (TOR) kinase.
Funder
Bio4Energy
Strategic Research Environment appointed by the Swedish government and ALW-NOW
Publisher
Cold Spring Harbor Laboratory
Cited by
34 articles.