Abstract
The problem of efficiently computing all $k$-edge-connected components ($k$-ECCs) of a graph G for a user-given k has been extensively studied recently in view of its importance in many applications. The $k$-ECCs of G for all possible values of k form a hierarchical structure; that is, any two different $k$-ECCs for the same k value are disjoint and any $k$-ECC is contained in a unique $(k-1)$-ECC. In this paper, we study the problem of efficiently constructing the hierarchy tree of the $k$-ECCs for all possible k values, for a graph G. The existing approaches $\textsf{TD}$ and $\textsf{BU}$ construct the hierarchy tree in a top-down and a bottom-up manner, respectively, and both have a time complexity of $\mathcal{O}\big(\delta(G) \times \mathsf{T_{KECC}}(G)\big)$, where $\delta(G)$ is the degeneracy of G and $\mathsf{T_{KECC}}(G)$ is the time complexity of computing all $k$-ECCs of G for a specific k value. Here, the degeneracy of G is defined as the maximum value among the minimum vertex degrees of all subgraphs of G and is at most $\sqrt{m}$, where m is the number of edges in G. To improve the time complexity, we propose a divide-and-conquer approach $\textsf{DC}$ running in $\mathcal{O}\big((\log \delta(G)) \times \mathsf{T_{KECC}}(G)\big)$ time; this time complexity is optimal up to a logarithmic factor. However, a straightforward implementation of $\textsf{DC}$ would take $\mathcal{O}\big((m + n) \log \delta(G)\big)$ main-memory space, where n is the number of vertices in G, and could easily run out of memory when processing large graphs. To reduce the main-memory footprint of our algorithm, we propose adjacency array-based techniques that reduce the space complexity to $2m + \mathcal{O}(n \log \delta(G))$, and we denote the resulting algorithm by $\mathsf{DC\text{-}AA}$. As a by-product of $\mathsf{DC\text{-}AA}$, we also improve the space complexity of the state-of-the-art algorithm for computing all $k$-ECCs for a specific k to $2m + \mathcal{O}(n)$, by using the same technique as in $\mathsf{DC\text{-}AA}$. Finally, we propose optimization techniques to improve the practical efficiency of the existing approach $\textsf{BU}$ and denote the space-optimized version of it by $\mathsf{BU^*\text{-}AA}$, which runs in $\mathcal{O}\big(\delta(G) \times \mathsf{T_{KECC}}(G)\big)$ time and $2m + \mathcal{O}(n)$ space. Extensive experiments on large real graphs and synthetic graphs demonstrate that our algorithms $\mathsf{DC\text{-}AA}$ and $\mathsf{BU^*\text{-}AA}$ outperform the state-of-the-art approaches by up to 28 times in terms of running time and by up to 8 times in terms of main memory usage. In particular, our approach $\mathsf{BU^*\text{-}AA}$ processes the Twitter graph, which has more than 1 billion undirected edges, in 29 min with 13.5 GB memory, while the state-of-the-art approaches take more than 13 h even after our space optimization is applied to them; without our space optimization, the state-of-the-art approaches run out of memory. Our empirical study also shows that $\mathsf{BU^*\text{-}AA}$, despite having a higher time complexity, performs better than $\mathsf{DC\text{-}AA}$ in practice. We also remark that $\mathsf{BU^*\text{-}AA}$ is much simpler and easier to implement than $\mathsf{DC\text{-}AA}$.
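To make two of the quantities above concrete, the following is a minimal sketch and not the paper's implementation. It stores an undirected graph as an adjacency array (one neighbor array with 2m entries plus an offset array with n + 1 entries) and computes the degeneracy by the standard minimum-degree peeling procedure. The function names `build_adjacency_array` and `degeneracy`, and the assumption that vertices are labeled 0 to n - 1 in a simple graph, are illustrative choices that do not come from the paper.

```python
from typing import List, Sequence, Tuple


def build_adjacency_array(
    n: int, edges: Sequence[Tuple[int, int]]
) -> Tuple[List[int], List[int]]:
    """Return (offsets, neighbors) for a simple undirected graph.

    neighbors holds 2m entries (each edge is stored in both directions) and
    offsets holds n + 1 entries; the neighbors of v are
    neighbors[offsets[v]:offsets[v + 1]].
    """
    degree = [0] * n
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    offsets = [0] * (n + 1)
    for v in range(n):
        offsets[v + 1] = offsets[v] + degree[v]
    neighbors = [0] * (2 * len(edges))
    cursor = offsets[:n]  # next free slot of each vertex (a copy)
    for u, v in edges:
        neighbors[cursor[u]] = v
        cursor[u] += 1
        neighbors[cursor[v]] = u
        cursor[v] += 1
    return offsets, neighbors


def degeneracy(offsets: List[int], neighbors: List[int]) -> int:
    """Degeneracy via repeatedly removing a vertex of minimum degree."""
    n = len(offsets) - 1
    degree = [offsets[v + 1] - offsets[v] for v in range(n)]
    max_degree = max(degree, default=0)
    # buckets[d] lists vertices whose degree was d when they were inserted;
    # outdated entries are detected and skipped when popped.
    buckets: List[List[int]] = [[] for _ in range(max_degree + 1)]
    for v in range(n):
        buckets[degree[v]].append(v)
    removed = [False] * n
    result = 0
    d = 0
    processed = 0
    while processed < n:
        if not buckets[d]:
            d += 1
            continue
        v = buckets[d].pop()
        if removed[v] or degree[v] != d:
            continue  # stale entry left over from an earlier degree
        removed[v] = True
        processed += 1
        result = max(result, d)
        for w in neighbors[offsets[v]:offsets[v + 1]]:
            if not removed[w]:
                degree[w] -= 1
                buckets[degree[w]].append(w)
                d = min(d, degree[w])
    return result
```

For example, for a triangle on vertices 0, 1, 2 with a pendant vertex 3 attached to vertex 2, `build_adjacency_array(4, [(0, 1), (1, 2), (0, 2), (2, 3)])` yields a neighbor array of length 8 = 2m, and `degeneracy` on that representation returns 2, the minimum degree of the triangle subgraph.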
Funder
Australian Research Council
Publisher
Springer Science and Business Media LLC
Subject
Hardware and Architecture, Information Systems