Affiliation:
1. King Digital Entertainment Ltd, Stockholm, Sweden
2. Aarhus University, Aarhus, Denmark
3. ISI Foundation, Turin, Italy
4. Amherst College, Amherst, MA, USA
Abstract
“Perhaps he could dance first and think afterwards, if it isn’t too much to ask him.”
S. Beckett, Waiting for Godot
Given a labeled graph, the collection of k-vertex induced connected subgraph patterns that appear in the graph more frequently than a user-specified minimum threshold provides a compact summary of the characteristics of the graph, and finds applications ranging from biology to network science. However, finding these patterns is challenging, even more so for dynamic graphs that evolve over time, due to the streaming nature of the input and the exponential time complexity of the problem.
We study this task in both incremental and fully-dynamic streaming settings, where arbitrary edges can be added to or removed from the graph. We present TipTap, a suite of algorithms to compute high-quality approximations of the frequent k-vertex subgraphs w.r.t. a given threshold, at any time (i.e., point of the stream), with high probability. In contrast to existing state-of-the-art solutions that require iterating over the entire set of subgraphs in the vicinity of the updated edge, TipTap operates by efficiently maintaining a uniform sample of connected k-vertex subgraphs, thanks to an optimized neighborhood-exploration procedure. We provide a theoretical analysis of the proposed algorithms in terms of their unbiasedness and of the sample size needed to obtain a desired approximation quality. Our analysis relies on sample-complexity bounds that use the Vapnik–Chervonenkis dimension, a key concept from statistical learning theory, which allows us to derive a sufficient sample size that is independent of the size of the graph. The results of our empirical evaluation demonstrate that TipTap returns high-quality results more efficiently and accurately than existing baselines.
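To make the claim about a graph-independent sample size concrete, the following is a minimal sketch of the generic ε-approximation bound from statistical learning theory, m ≥ (c/ε²)(d + ln(1/δ)), where d is an upper bound on the VC dimension. It is not the specific bound derived for TipTap: the function name, the constant c = 0.5 (an empirical estimate commonly used in the literature), and the example value of d are all assumptions for illustration.

```python
import math


def sufficient_sample_size(eps: float, delta: float, vc_dim: int, c: float = 0.5) -> int:
    """Generic epsilon-approximation sample size (illustrative, not TipTap's exact bound).

    eps    -- maximum allowed absolute error on each estimated frequency
    delta  -- allowed failure probability
    vc_dim -- an upper bound on the VC dimension of the range space (assumed given)
    c      -- universal constant; 0.5 is an assumed, commonly used estimate
    """
    if not (0 < eps < 1 and 0 < delta < 1):
        raise ValueError("eps and delta must lie strictly between 0 and 1")
    if vc_dim < 0:
        raise ValueError("vc_dim must be non-negative")
    # Note that no graph-size parameter appears anywhere in the formula.
    return math.ceil((c / eps**2) * (vc_dim + math.log(1.0 / delta)))


if __name__ == "__main__":
    # Hypothetical parameters: 5% error, 10% failure probability, VC dimension at most 3.
    print(sufficient_sample_size(eps=0.05, delta=0.1, vc_dim=3))
```

The function takes only the error, the failure probability, and the VC-dimension bound as inputs, which is the sense in which such a sample size does not grow with the number of vertices or edges of the graph.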
Funder
Academy of Finland projects
EC H2020 RIA project “SoBigData”
National Science Foundation project
Publisher
Association for Computing Machinery (ACM)
Cited by
8 articles.