Affiliation:
1. The Chinese University of Hong Kong
Abstract
Counting subgraphs is a fundamental analysis task for online social networks (OSNs). Given the sheer size and restricted access of OSN, efficient computation of subgraph counts is highly challenging. Although a number of algorithms have been proposed to estimate the relative counts of subgraphs in OSNs with restricted access, there are only few works which try to solve a more general problem, i.e., counting subgraph frequencies. In this article, we propose an efficient random walk-based framework to estimate the subgraph counts. Our framework generates samples by leveraging consecutive steps of the random walk as well as by observing neighbors of visited nodes. Using the importance sampling technique, we derive unbiased estimators of the subgraph counts. To make better use of the degree information of visited nodes, we also design improved estimators, which increases the accuracy of the estimation with no additional cost. We conduct extensive experimental evaluation on real-world OSNs to confirm our theoretical claims. The experiment results show that our estimators are unbiased, accurate, efficient, and better than the state-of-the-art algorithms. For the Weibo graph with more than 58 million nodes, our method produces estimate of triangle count with an error less than 5% using only 20,000 sampled nodes. Detailed comparison with the state-of-the-art methods demonstrates that our algorithm is 2--10 times more accurate.
Publisher
Association for Computing Machinery (ACM)
Cited by
12 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献