Affiliation:
1. The Chinese University of Hong Kong, Hong Kong, Hong Kong
2. Hamad Bin Khalifa University, Doha, Qatar
3. University of Melbourne, Melbourne, VIC, Australia
Abstract
A bipartite graph is a graph that consists of two disjoint sets of vertices and only edges between vertices from different vertex sets. In this paper, we study the counting problems of two common types of em motifs in bipartite graphs: (i) butterflies (2x2 bicliques) and (ii) bi-triangles (length-6 cycles). Unlike most of the existing algorithms that aim to obtain exact counts, our goal is to obtain precise enough estimations of these counts in bipartite graphs, as such estimations are already sufficient and of great usefulness in various applications. While there exist approximate algorithms for butterfly counting, these algorithms are mainly based on the techniques designed for general graphs, and hence, they are less effective on bipartite graphs. Not to mention that there is still a lack of study on approximate bi-triangle counting. Motivated by this, we first propose a novel butterfly counting algorithm, called one-sided weighted sampling, which is tailored for bipartite graphs. The basic idea of this algorithm is to estimate the total butterfly count with the number of butterflies containing two randomly sampled vertices from the same side of the two vertex sets. We prove that our estimation is unbiased, and our technique can be further extended (non-trivially) for bi-triangle count estimation. Theoretical analyses under a power-law random bipartite graph model and extensive experiments on multiple large real datasets demonstrate that our proposed approximate counting algorithms can reach high accuracy, yet achieve up to three orders (resp. four orders) of magnitude speed-up over the state-of-the-art exact butterfly (resp. bi-triangle) counting algorithms. Additionally, we present an approximate clustering coefficient estimation framework for bipartite graphs, which shows a similar speed-up over the exact solutions with less than 1% relative error.
Funder
Hong Kong RGC ECS grant
RGC CRF grant
Qatar National Research Fund
NSFC grant
RGC GRF grant
ARC Discovery Early Career Researcher Award
Hong Kong ITC ITF grant
Publisher
Association for Computing Machinery (ACM)
Cited by
1 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献