Abstract
Medaka (Oryzias latipes) has become an important vertebrate model widely used in genetics, developmental biology, environmental sciences, and many other fields. A high-quality genome sequence and a variety of genetic tools are available for this model organism. However, the existing genome annotation is still rudimentary, as it was based mainly on computational prediction and short-read RNA-seq data. Here we report a dynamic transcriptome landscape of medaka embryogenesis profiled by long-read RNA-seq, short-read RNA-seq, and ATAC-seq. By integrating these data sets, we constructed a much-improved gene model set that includes about 17,000 novel isoforms, and we identified 1600 transcription factors, 1100 long noncoding RNAs, and 150,000 potential cis-regulatory elements. The time-series data sets provided another dimension of information. Using the expression dynamics of genes and the accessibility dynamics of cis-regulatory elements, we investigated isoform switching, as well as the regulatory logic between accessible elements and genes, during embryogenesis. We built a user-friendly medaka omics data portal to present these data sets. This resource provides the first comprehensive omics data sets of medaka embryogenesis. Finally, we term these three assays the minimum ENCODE toolbox and propose their use as the initial, essential genomic profiling assays for model organisms that have limited data available. This work will be of great value to the research community using medaka as a model organism, and to many others as well.
Funder
National Natural Science Foundation of China
Chinese Academy of Sciences
National Institutes of Natural Sciences
Publisher
Cold Spring Harbor Laboratory
Subject
Genetics (clinical), Genetics
Cited by
26 articles.