Affiliation:
1. SISSA – Scuola Internazionale Superiore di Studi Avanzati, Trieste I-34077, Italy
2. Structural Genomics Group, CNAG-CRG Centre Nacional d’Anàlisi Genòmica – Centre de Regulació Genòmica, Barcelona 08028, Spain
Abstract
Motivation
Hi-C matrices are cornerstones for qualitative and quantitative studies of genome folding, from its territorial organization to compartments and topological domains. The high dynamic range of genomic distances probed in Hi-C assays is reflected in an inherent stochastic background of the interaction matrices, which inevitably convolve the features of interest with largely non-specific ones.
Results
Here, we introduce and discuss essHi-C, a method to isolate the specific, or essential, component of Hi-C matrices from the non-specific portion of the spectrum that is compatible with random matrices. Systematic comparisons show that essHi-C improves the clarity of the interaction patterns, makes the identification of topologically associating domains more robust to sequencing depth, allows the unsupervised clustering of experiments in different cell lines and recovers the cell-cycle phasing of single cells based on Hi-C data. Thus, essHi-C provides a means for isolating significant biological and physical features from Hi-C matrices.
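The idea of separating an essential spectral component from a random-like background can be illustrated with a minimal sketch. This is not the essHi-C implementation or the API of the essHIC package: the function name `spectral_filter`, the parameter `n_keep` and the toy block matrix are assumptions made for illustration, and the criterion the method uses to decide which part of the spectrum is compatible with random matrices is replaced here by a fixed, user-chosen number of eigenmodes.

```python
import numpy as np


def spectral_filter(matrix, n_keep):
    """Rebuild a symmetric matrix from its n_keep eigenmodes of largest |eigenvalue|.

    Hypothetical helper: n_keep is chosen by hand here, whereas a real
    analysis would need a principled cutoff against a random-matrix null.
    """
    sym = 0.5 * (matrix + matrix.T)            # enforce exact symmetry
    eigvals, eigvecs = np.linalg.eigh(sym)     # real spectrum, orthonormal vectors
    order = np.argsort(np.abs(eigvals))[::-1][:n_keep]
    kept_vals = eigvals[order]
    kept_vecs = eigvecs[:, order]
    # Sum of lambda_i * v_i v_i^T over the retained modes
    return (kept_vecs * kept_vals) @ kept_vecs.T


# Toy example: three diagonal blocks (domain-like signal) plus symmetric noise
rng = np.random.default_rng(0)
n = 60
labels = np.repeat([0, 1, 2], n // 3)
signal = (labels[:, None] == labels[None, :]).astype(float)
noise = rng.normal(scale=0.3, size=(n, n))
noise = 0.5 * (noise + noise.T)
noisy = signal + noise

filtered = spectral_filter(noisy, n_keep=3)

# Distance to the noise-free signal before and after filtering
err_noisy = np.linalg.norm(noisy - signal)
err_filtered = np.linalg.norm(filtered - signal)
print(err_noisy, err_filtered)
```

In this toy setting the three retained modes carry the block structure, while the discarded modes hold most of the added noise, so the filtered matrix lies closer to the noise-free signal than the noisy input does.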
Availability and implementation
The essHi-C software package is available at https://github.com/stefanofranzini/essHIC.
Supplementary information
Supplementary data are available at Bioinformatics online.
Funder
Italian Ministry for University
Publisher
Oxford University Press (OUP)
Subject
Computational Mathematics, Computational Theory and Mathematics, Computer Science Applications, Molecular Biology, Biochemistry, Statistics and Probability