Author:
Liu Silvia, Wu Indira, Yu Yan-Ping, Balamotis Michael, Ren Baoguo, Yehezkel Tuval Ben, Luo Jian-Hua
Abstract
Diversity in human gene expression stems, to a large extent, from splicing exons into multiple mRNA isoforms. Characterization of isoforms requires accurate long-read sequencing. However, read lengths, high error rates, low throughput and large input requirements are some of the challenges that remain to be addressed in sequencing technologies. In this study, we used a barcoding-based synthetic long-read (SLR) isoform sequencing approach, LoopSeq, to generate sequencing reads sufficiently long and accurate to identify isoforms using standard short-read Illumina sequencers. The method identifies isoforms from control RNA samples with 99.4% accuracy and a 0.01% per-base error rate, exceeding the accuracy reported for other long-read sequencing technologies. Applied to targeted transcriptome sequencing of over 10,000 genes from colon cancers and their metastatic counterparts, LoopSeq revealed large-scale isoform redistributions from benign colon mucosa to primary colon cancer and metastatic cancer and identified several novel gene fusion isoforms in the colon cancer samples. Strikingly, our data showed that most single nucleotide variants (SNVs) occurred dominantly in specific isoforms and that some SNVs underwent isoform switching in cancer progression. The ability to use short-read sequencers to generate accurate long-read isoform information as the raw unit of transcriptional information holds promise as a new and widely accessible approach in RNA isoform analyses.
Publisher
Cold Spring Harbor Laboratory
Cited by
1 article.