Abstract
Bipartite graphs are of great importance in many real-world applications. The butterfly, a complete $2 \times 2$ biclique, plays a key role in bipartite graphs. In this paper, we investigate the problem of efficiently counting the number of butterflies. The most advanced techniques are based on enumerating wedges (paths of length two), which is the dominant cost of counting butterflies. Nevertheless, the existing algorithms cannot efficiently handle large-scale bipartite graphs, and this becomes a bottleneck in large-scale applications. Instead of the existing layer-priority-based techniques, we propose a vertex-priority-based paradigm, $\mathsf{BFC}$-$\mathsf{VP}$, that enumerates far fewer wedges; this leads to a significant improvement over the time complexity of the state-of-the-art algorithms. In addition, we present cache-aware strategies that further improve time efficiency while theoretically retaining the time complexity of $\mathsf{BFC}$-$\mathsf{VP}$. We also show that our proposed techniques work efficiently in external-memory and parallel contexts. Moreover, we study the butterfly counting problem on batch-dynamic graphs. Specifically, given a bipartite graph $G$ and a batch update of edges $B$, we aim to maintain the number of butterflies in $G$. To tackle this problem, we propose fast vertex-priority-based algorithms with optimizations that reduce the computation spent on existing wedges in $G$. Our extensive empirical studies demonstrate that the proposed techniques significantly outperform the baseline solutions on real datasets.
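The vertex-priority idea can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `count_butterflies`, the edge-list input, and the priority rule (higher degree first, ties broken by vertex label) are assumptions made for illustration, and the cache-aware, external-memory, parallel, and batch-dynamic techniques are omitted. Each butterfly is counted once, from its highest-priority vertex, by enumerating only wedges whose middle and end vertices both have lower priority than the start vertex.

```python
from collections import defaultdict


def count_butterflies(edges):
    """Count butterflies (2x2 bicliques) in a bipartite graph.

    `edges` is an iterable of (u, v) pairs, with u in the upper layer
    and v in the lower layer. Labels within a layer are assumed to be
    mutually comparable (e.g. all ints); this is an assumption of the
    sketch, not something the abstract specifies.
    """
    # Tag each vertex with its layer so labels cannot collide.
    adj = defaultdict(set)
    for u, v in edges:
        a, b = ("U", u), ("L", v)
        adj[a].add(b)
        adj[b].add(a)

    # Assumed priority: larger degree means higher priority,
    # with ties broken by the tagged label. rank[x] is the position
    # of x in ascending priority order.
    order = sorted(adj, key=lambda x: (len(adj[x]), x))
    rank = {x: i for i, x in enumerate(order)}

    total = 0
    for start in adj:
        # wedge_count[end] = number of wedges start - mid - end
        # in which both mid and end have lower priority than start.
        wedge_count = defaultdict(int)
        for mid in adj[start]:
            if rank[mid] >= rank[start]:
                continue
            for end in adj[mid]:
                if rank[end] < rank[start]:
                    wedge_count[end] += 1
        # Any two such wedges sharing the same end form one butterfly.
        total += sum(c * (c - 1) // 2 for c in wedge_count.values())
    return total
```

Because the start vertex must outrank every other vertex in a wedge, wedges centered on high-degree vertices are reached only from the few vertices that outrank them, which is the intuition behind enumerating fewer wedges than a layer-based scan.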
Funder
University of New South Wales
Publisher
Springer Science and Business Media LLC
Subject
Hardware and Architecture,Information Systems