A near-optimal approach to edge connectivity-based hierarchical graph decomposition

Author:

Chang Lijun, Wang Zhiyi

Abstract

The problem of efficiently computing all $k$-edge-connected components ($k$-ECCs) of a graph G for a user-given k has been extensively studied recently in view of its importance in many applications. The $k$-ECCs of G for all possible values of k form a hierarchical structure; that is, any two different $k$-ECCs for the same k value are disjoint and any $k$-ECC is contained in a unique $(k-1)$-ECC. In this paper, we study the problem of efficiently constructing the hierarchy tree of the $k$-ECCs for all possible k values, for a graph G. The existing approaches $\mathsf{TD}$ and $\mathsf{BU}$ construct the hierarchy tree in a top-down manner and a bottom-up manner, respectively, and both have a time complexity of $\mathcal{O}\big(\delta(G) \times \mathsf{T_{KECC}}(G)\big)$, where $\delta(G)$ is the degeneracy of G and $\mathsf{T_{KECC}}(G)$ is the time complexity of computing all $k$-ECCs of G for a specific k value. Here, the degeneracy of G is defined as the maximum value among the minimum vertex degrees of all subgraphs of G and is at most $\sqrt{m}$, where m is the number of edges in G. To improve the time complexity, we propose a divide-and-conquer approach $\mathsf{DC}$ running in $\mathcal{O}\big((\log \delta(G)) \times \mathsf{T_{KECC}}(G)\big)$ time; this time complexity is optimal up to a logarithmic factor. However, a straightforward implementation of $\mathsf{DC}$ would take $\mathcal{O}((m + n) \log \delta(G))$ main-memory space, which could easily run out of memory when processing large graphs; here, n is the number of vertices in G.
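As an illustration of the degeneracy $\delta(G)$ defined above, the following is a minimal sketch of the standard min-degree peeling procedure: repeatedly remove a vertex of minimum remaining degree, and report the largest degree seen at removal time. The function name `degeneracy` and the edge-list input format are assumptions made for this example; they do not come from the paper.

```python
def degeneracy(n, edges):
    """Degeneracy of an undirected simple graph on vertices 0..n-1.

    Hypothetical helper for illustration: peels minimum-degree vertices
    using degree buckets and returns the largest degree at removal time.
    """
    adj = [set() for _ in range(n)]
    for u, v in edges:
        if u != v:  # ignore self-loops
            adj[u].add(v)
            adj[v].add(u)

    deg = [len(a) for a in adj]
    max_deg = max(deg, default=0)

    # buckets[d] holds the not-yet-removed vertices of current degree d
    buckets = [set() for _ in range(max_deg + 1)]
    for v in range(n):
        buckets[deg[v]].add(v)

    removed = [False] * n
    result = 0
    d = 0
    for _ in range(n):
        # After one removal the minimum degree can drop by at most 1,
        # so restart the scan one bucket lower.
        d = max(d - 1, 0)
        while not buckets[d]:
            d += 1
        v = buckets[d].pop()
        removed[v] = True
        result = max(result, d)
        for w in adj[v]:
            if not removed[w]:
                buckets[deg[w]].remove(w)
                deg[w] -= 1
                buckets[deg[w]].add(w)
    return result
```

For example, the complete graph on four vertices has degeneracy 3, while any path has degeneracy 1.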
To reduce the main-memory footprint of our algorithm, we propose adjacency array-based techniques to optimize the space complexity to $2m + \mathcal{O}(n \log \delta(G))$ and denote our resulting algorithm by $\mathsf{DC\text{-}AA}$. As a by-product of $\mathsf{DC\text{-}AA}$, we also improve the space complexity of the state-of-the-art algorithm for computing all $k$-ECCs for a specific k to $2m + \mathcal{O}(n)$, by using the same technique as used in $\mathsf{DC\text{-}AA}$. Finally, we propose optimization techniques to improve the practical efficiency of the existing approach $\mathsf{BU}$ and denote the space-optimized version of it as $\mathsf{BU^*\text{-}AA}$, which runs in $\mathcal{O}\big(\delta(G) \times \mathsf{T_{KECC}}(G)\big)$ time and $2m + \mathcal{O}(n)$ space. Extensive experiments on large real graphs and synthetic graphs demonstrate that our algorithms $\mathsf{DC\text{-}AA}$ and $\mathsf{BU^*\text{-}AA}$ outperform the state-of-the-art approaches by up to 28 times in terms of running time and by up to 8 times in terms of main memory usage. In particular, our approach $\mathsf{BU^*\text{-}AA}$ processes the Twitter graph, which has more than 1 billion undirected edges, in 29 min with 13.5 GB memory, while the state-of-the-art approaches take more than 13 h even with our space optimization; note that the state-of-the-art approaches run out of memory without our space optimization. Our empirical study also shows that $\mathsf{BU^*\text{-}AA}$, despite having a higher time complexity, performs better than $\mathsf{DC\text{-}AA}$ in practice. We also remark that $\mathsf{BU^*\text{-}AA}$ is much simpler and easier to implement than $\mathsf{DC\text{-}AA}$.
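To make the $2m + \mathcal{O}(n)$ space figure concrete, here is a minimal sketch of a generic adjacency-array (compressed sparse row) layout: one array of $2m$ neighbor entries plus one offset array of $n + 1$ entries. This is the textbook layout only, not the paper's $\mathsf{DC\text{-}AA}$ technique, whose details are not given in the abstract; the names `build_adjacency_array` and `neighbors_of` are hypothetical.

```python
from array import array


def build_adjacency_array(n, edges):
    """Build a generic adjacency array for an undirected graph.

    Assumption for illustration: vertices are 0..n-1 and `edges` is a list
    of (u, v) pairs. Returns (offsets, neighbors), where the neighbors of v
    are neighbors[offsets[v]:offsets[v + 1]].
    """
    # offsets has n + 1 entries; first count degrees, shifted by one slot
    offsets = array("q", [0]) * (n + 1)
    for u, v in edges:
        offsets[u + 1] += 1
        offsets[v + 1] += 1

    # prefix sums turn the degree counts into start positions
    for i in range(n):
        offsets[i + 1] += offsets[i]

    # each undirected edge is stored twice, giving 2m neighbor entries
    neighbors = array("q", [0]) * (2 * len(edges))
    cursor = offsets[:n]  # copy: next free slot for each vertex
    for u, v in edges:
        neighbors[cursor[u]] = v
        cursor[u] += 1
        neighbors[cursor[v]] = u
        cursor[v] += 1
    return offsets, neighbors


def neighbors_of(offsets, neighbors, v):
    """Return the neighbor slice of vertex v."""
    return neighbors[offsets[v]:offsets[v + 1]]
```

Compared with per-vertex lists or linked structures, this layout stores no per-edge pointers, which is where the constant factor of 2 on $m$ comes from.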

Funder

Australian Research Council

Publisher

Springer Science and Business Media LLC

Subject

Hardware and Architecture, Information Systems

