Author:
Vaughn Justin N., Ellingson Sally R., Mignone Flavio, von Arnim Albrecht
Abstract
The sequence elements that mediate post-transcriptional gene regulation often reside in the 5′ and 3′ untranslated regions (UTRs) of mRNAs. Using six different families of dicotyledonous plants, we developed a comparative transcriptomics pipeline for the identification and annotation of deeply conserved regulatory sequences in the 5′ and 3′ UTRs. Our approach was robust to the confounding effects of poor UTR alignability and rampant paralogy in plants. In the 3′ UTR, motifs resembling PUMILIO-binding sites form a prominent group of conserved motifs. Additionally, Expansins, one of the few plant mRNA families known to be localized to specific subcellular sites, possess a core conserved RCCCGC motif. In the 5′ UTR, one major subset of motifs consists of purine-rich repeats. A distinct and substantial fraction possesses upstream AUG start codons. Half of the AUG-containing motifs reveal hidden protein-coding potential in the 5′ UTR, while the other half point to a peptide-independent function related to translation. Among the former, we added four novel peptides to the small catalog of conserved-peptide uORFs. Among the latter, our case studies document patterns of uORF evolution that include gain and loss of uORFs, switches in uORF reading frame, and switches in uORF length and position. In summary, nearly three hundred post-transcriptional elements show evidence of purifying selection across the eudicot branch of flowering plants, indicating a regulatory function spanning at least 70 million years. Some of these sequences have experimental precedent, but many are novel and encourage further exploration.
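As an illustration of two sequence features named in the abstract, the following minimal Python sketch scans a UTR for the RCCCGC motif (reading R as the IUPAC purine code, A or G) and lists upstream AUGs that have an in-frame stop codon within the 5′ UTR. It is not the authors' pipeline, which relies on cross-family conservation. The function names `find_motif` and `find_uorfs` are hypothetical, and the reported frame is measured relative to the start of the supplied UTR, which is an assumption made for this example.

```python
import re

# IUPAC code R = purine (A or G); the motif RCCCGC comes from the abstract.
# A lookahead is used so that overlapping matches would also be reported.
RCCCGC = re.compile(r"(?=([AG]CCCGC))")

STOPS = {"UAA", "UAG", "UGA"}


def _to_rna(seq):
    """Upper-case the sequence and convert DNA-style T to RNA-style U."""
    return seq.upper().replace("T", "U")


def find_motif(utr):
    """Return 0-based start positions of RCCCGC matches in a UTR."""
    return [m.start() for m in RCCCGC.finditer(_to_rna(utr))]


def find_uorfs(utr5):
    """Return (start, end, frame) for each upstream AUG that reaches an
    in-frame stop codon inside the supplied 5' UTR.

    start/end are 0-based, end-exclusive; frame is start % 3, measured
    from the first base of the UTR (an assumption of this sketch).
    """
    seq = _to_rna(utr5)
    uorfs = []
    for m in re.finditer(r"(?=AUG)", seq):
        start = m.start()
        # Walk codon by codon from the AUG until the first stop codon.
        for i in range(start + 3, len(seq) - 2, 3):
            if seq[i:i + 3] in STOPS:
                uorfs.append((start, i + 3, start % 3))
                break
    return uorfs
```

uORFs that run past the end of the UTR into the main coding sequence are not reported by this sketch, and a single match says nothing about conservation; the abstract's claims rest on comparison across plant families.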
Publisher
Cold Spring Harbor Laboratory