Abstract
High-quality, chromosome-scale genomes are essential for genomic analyses. Analyses such as 3D genomics, epigenetics, and comparative genomics rely on a high-quality genome assembly, which is often produced with the assistance of Hi-C data. Curation of genomes reveals that current Hi-C-assisted scaffolding algorithms either generate ordering and orientation errors or fail to assemble high-quality chromosome-level scaffolds. Here, we present Puzzle Hi-C, software that uses Hi-C reads to accurately assign contigs or scaffolds to chromosomes. Puzzle Hi-C counts interactions in a triangular region of the Hi-C heatmap rather than a square region. This strategy greatly reduces the scaffolding interference caused by long-range interactions. The software also introduces a dynamic triangle-window strategy during assembly: the window starts small and expands with interactions to produce more effective clustering. Puzzle Hi-C outperforms available scaffolding tools.
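The difference between counting in a square region and in a triangular region can be illustrated with a minimal sketch. It is not the Puzzle Hi-C implementation. It assumes a w x w block of contact counts between the last w bins of one contig and the first w bins of the next, with row i and column j indexed by distance from the junction, and it takes the "triangle" to be the cells with i + j < w, that is, the half of the block closest to the heatmap diagonal. The function names, the bin layout, and the fixed schedule of window sizes are all hypothetical; the abstract does not specify how the window actually grows.

```python
import numpy as np


def square_count(block):
    """Sum every contact in the w x w block between two contig ends."""
    return float(block.sum())


def triangle_count(block):
    """Sum only cells with i + j < w (assumed triangle nearest the junction).

    Row i is the i-th bin back from the end of the first contig and
    column j is the j-th bin into the second contig, so i + j grows
    with genomic distance. Cells in the far corner, which hold the
    longest-range contacts, are excluded.
    """
    w = block.shape[0]
    i, j = np.indices(block.shape)
    return float(block[i + j < w].sum())


def expanding_triangle_counts(block, start=2):
    """Triangle counts for windows growing from `start` bins to the full block.

    This is only a stand-in for a dynamic window: it evaluates a fixed
    series of sizes rather than any data-driven expansion rule.
    """
    w = block.shape[0]
    return [triangle_count(block[:k, :k]) for k in range(start, w + 1)]


if __name__ == "__main__":
    # Toy block with a single long-range contact in the far corner.
    block = np.zeros((4, 4))
    block[3, 3] = 5
    print(square_count(block), triangle_count(block))
```

In this toy case the square count includes the far-corner contact while the triangle count ignores it, which is the kind of long-range interference the triangular region is meant to suppress.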
Funder
National Natural Science Foundation of China
Yunnan Fundamental Research Projects
National Key R&D Program of China
State Key Laboratory for Conservation and Utilization of Bio-resource in Yunnan
Publisher
Public Library of Science (PLoS)