Author:
Adams Matthew, Vollmers Christopher
Abstract
Generating an accurate and complete genome annotation for an organism is complex because the cells within each tissue can express a unique set of transcript isoforms from a unique set of genes. A comprehensive genome annotation should contain information on which tissues express which transcript isoforms at what level. This tissue-level isoform information can then inform a wide range of research questions as well as experiment designs. Long-read sequencing technology combined with advanced full-length cDNA library preparation methods has now reached a throughput and accuracy at which generating these types of annotations is achievable. Here, we show this by generating a genome annotation of the mouse (Mus musculus). We used the nanopore-based R2C2 long-read sequencing method to generate 64 million highly accurate full-length cDNA consensus reads, averaging 5.4 million reads per tissue for a dozen tissues. Using the Mandalorion tool, we processed these reads to generate the Tissue-level Atlas of Mouse Isoforms (TAMI, available at https://genome.ucsc.edu/s/vollmers/TAMI), which we believe will be a valuable complement to conventional, manually curated reference genome annotations.
Publisher
Cold Spring Harbor Laboratory