Abstract
The assembly of reference-quality, chromosome-resolution genomes for both model and novel eukaryotic organisms is an increasingly achievable task for single research teams. However, the overwhelming abundance of sequencing technologies, assembly algorithms, and post-assembly processing tools currently available means that there is no clear consensus on a best-practice computational protocol for eukaryotic de novo genome assembly. Here, we provide a comprehensive benchmark of 28 state-of-the-art assembly and polishing packages, in various combinations, when assembling two eukaryotic genomes using both next-generation (Illumina HiSeq) and third-generation (Oxford Nanopore and PacBio CLR) sequencing data, at both controlled and open levels of sequencing coverage. Recommendations are made for the most effective tools for each sequencing technology and the best-performing combinations of methods, evaluated against common assessment metrics such as contiguity, computational performance, gene completeness, and reference reconstruction, across both organisms and across sequencing coverage depth.
Publisher
Cold Spring Harbor Laboratory