Author:
Naftaly Alice S., Pau Shana, White Michael A.
Abstract
Alternate isoforms contribute immensely to phenotypic diversity across eukaryotes. While short-read RNA sequencing has increased our understanding of isoform diversity, it is challenging to accurately detect full-length transcripts with short reads, which prevents the identification of many alternate isoforms. Long-read sequencing technologies have made it possible to sequence full-length alternative transcripts, accurately characterizing alternative splicing events, alternate transcription start and end sites, and differences in UTRs. Here, we utilize PacBio long-read RNA sequencing (Iso-Seq) to examine the transcriptomes of five tissues in threespine stickleback fish (Gasterosteus aculeatus), a widely used genetic model species. The threespine stickleback fish has a refined genome assembly with gene annotations that are based on short-read RNA sequencing and predictions from the coding sequences of other species. This suggests that some of the existing annotations may be inaccurate or that alternative transcripts may not be fully characterized. Using Iso-Seq, we detected thousands of novel isoforms, indicating that many isoforms are absent from the current Ensembl gene annotations. In addition, we refined many of the existing annotations within the genome. We noted many improperly positioned transcription start sites that were refined with long-read sequencing. The Iso-Seq-predicted transcription start sites were more accurate, as verified through ATAC-seq. We were also able to detect many alternative splicing events between sexes and across tissues. We found a substantial number of genes in both somatic and gonad tissue that had sex-specific isoforms. Our study highlights the power of long-read sequencing to study the complexity of transcriptomes, greatly improving genomic resources for the threespine stickleback fish.
Publisher
Cold Spring Harbor Laboratory
Cited by
1 article.