Abstract
As splicing is intimately coupled with transcription, understanding splicing mechanisms requires an understanding of splicing timing, which is currently limited. Here, we developed CoLa-seq (co-transcriptional lariat sequencing), a genomic assay that reports splicing timing relative to transcription through analysis of nascent lariat intermediates. In human cells, we mapped 165,282 branch points and characterized splicing timing for over 70,000 introns. Splicing timing varies dramatically across introns, with regulated introns splicing later than constitutive introns. Machine learning-based modeling revealed genetic elements predictive of splicing timing, notably the polypyrimidine tract, intron length, and regional GC content, which illustrate the significance of the broader genomic context of an intron and the impact of co-transcriptional splicing. The importance of the splicing factor U2AF in early splicing rationalizes surprising observations that most introns can splice independent of exon definition. Together, these findings establish a critical framework for investigating the mechanisms and regulation of co-transcriptional splicing.
Highlights
CoLa-seq enables cell-type specific, genome-wide branch point annotation with unprecedented efficiency.
CoLa-seq captures co-transcriptional splicing for tens of thousands of introns and reveals splicing timing varies dramatically across introns.
Modeling uncovers key genetic determinants of splicing timing, most notably regional GC content, intron length, and the polypyrimidine tract, the binding site for U2AF2.
Early splicing precedes transcription of a downstream 5’ SS and in some cases accessibility of the upstream 3’ SS, precluding exon definition.
Publisher
Cold Spring Harbor Laboratory