Author:
Taggart Allison J., Lin Chien-Ling, Shrestha Barsha, Heintzelman Claire, Kim Seongwon, Fairbrother William G.
Abstract
The coding sequence of each human pre-mRNA is interrupted, on average, by 11 introns that must be spliced out for proper gene expression. Each intron contains three obligate signals: a 5′ splice site, a branch site, and a 3′ splice site. Splice site usage has been mapped exhaustively across different species, cell types, and cellular states. In contrast, only a small fraction of branch sites have been identified even once. The few reported branch site annotations are imprecise because reverse transcriptase skips several nucleotides while traversing a 2′–5′ linkage. Here, we report large-scale mapping of branchpoints from deep sequencing data in three different species and in the SF3B1 K700E oncogenic mutant background. We have developed a novel method whereby raw lariat reads are refined by U2 snRNP/pre-mRNA base-pairing models to return the largest current data set of branchpoint sequences with quality metrics. This analysis discovers novel modes of U2 snRNA:pre-mRNA base-pairing conserved in yeast and provides insight into the biogenesis of intron circles. Finally, matching branch site usage with isoform selection across the extensive panel of ENCODE RNA-seq data sets offers insight into the mechanisms by which branchpoint usage drives alternative splicing.
Funder
National Institutes of Health
National Institute of General Medical Sciences
NCRR
National Science Foundation
Lifespan Rhode Island Hospital
Brown University
NIGMS
Publisher
Cold Spring Harbor Laboratory
Subject
Genetics (clinical), Genetics
Cited by
102 articles.