Abstract
To better understand how DNA folds in 3D inside cell nuclei, researchers developed chromosome conformation capture methods such as Hi-C, which measure the contact frequencies between all pairs of DNA segments across the genome. Because Hi-C data sets are often massive, it is common to use bioinformatics methods to group DNA segments into 3D regions with correlated contact patterns, such as topologically associating domains (TADs) and A/B compartments. Recently, another research direction emerged that treats Hi-C data as a network of 3D contacts. In this representation, one can use community detection algorithms from complex network theory to group nodes into tightly connected mesoscale communities. However, because Hi-C networks are so densely connected, several node partitions may be feasible solutions to the community detection problem yet remain indistinguishable without additional data. Because this limitation is a fundamental property of the network, the problem persists regardless of the community-finding or data-clustering method. To help remedy this problem, we developed a method that charts the solution landscape of network partitions in Hi-C data from human cells. Our approach allows us to scan seamlessly through the scales of the network and determine the regimes where we can expect reliable community structures. We find that some scales are more robust than others and that strong clusters may differ significantly. Our work highlights that finding a robust community structure hinges on thoughtful algorithm design or method cross-evaluation.
Funder
Stiftelsen för Strategisk Forskning
Vetenskapsrådet
Umeå University
Publisher
Springer Science and Business Media LLC
Cited by
1 article.