Abstract
AbstractPacific saury (Cololabis saira) is a commercially important small pelagic fish species in Asian. In this study, we conducted the first-ever whole genome sequencing of this species, with single molecule, real-time (SMRT) sequencing technology. The obtained high-fidelity (HiFi) long-read sequence data, which amount to approximately 30 folds of its haploid genome size that was measured with quantitative PCR (1.17 Gb), were assembled into contigs. Scaffolding with Hi-C reads yielded a whole genome assembly containing 24 chromosome-scale sequences, with a scaffold N50 length of 47.7 Mb. Screening of repetitive elements including telomeric repeats was performed to characterize possible factors that need to be resolved towards ‘telomere-to-telomere’ sequencing. The larger genome size than in medaka, a close relative in Beloniformes, is at least partly explained by larger repetitive element quantity, which is reflected in more abundant tRNAs, in the Pacific saury genome. Protein-coding regions was predicted using transcriptome data, which resulted in 22,274 components. Retrieval of Pacific saury homologs of aquaporin (AQP) genes known from other teleost fishes validated high completeness and continuity of the genome assembly. These resources are available athttps://treethinkers.nig.ac.jp/saira/and will assist various molecular-level studies in fishery science and comparative biology.
Publisher
Cold Spring Harbor Laboratory