Affiliation:
1. University of Edinburgh and Harbin Institute of Technology
2. University of Edinburgh
3. University of Edinburgh and UC Santa Barbara
Abstract
In the real world a graph is often fragmented and distributed across different sites. This highlights the need for evaluating queries on distributed graphs. This paper proposes distributed evaluation algorithms for three classes of queries: reachability for determining whether one node can reach another, bounded reachability for deciding whether there exists a path of a bounded length between a pair of nodes, and regular reachability for checking whether there exists a path connecting two nodes such that the node labels on the path form a string in a given regular expression. We develop these algorithms based on partial evaluation, to explore parallel computation. When evaluating a query Q on a distributed graph G, we show that these algorithms possess the following performance guarantees, no matter how G is fragmented and distributed: (1) each site is visited only once; (2) the total network traffic is determined by the size of Q and the fragmentation of G, independent of the size of G; and (3) the response time is decided by the largest fragment of G rather than the entire G. In addition, we show that these algorithms can be readily implemented in the MapReduce framework. Using synthetic and real-life data, we experimentally verify that these algorithms are scalable on large graphs, regardless of how the graphs are distributed.
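The idea of partial evaluation for plain reachability can be illustrated with a minimal Python sketch. It is not the paper's algorithm, only one plausible reading of the abstract under stated assumptions: each site knows its own nodes, its outgoing edges, and which of its nodes receive edges from other sites. The names `partition`, `local_eval`, `assemble`, and `reachable` are hypothetical and chosen for illustration. Each site is evaluated once and returns, for every entry node, whether the target is reached locally and which remote nodes are reached; a coordinator then resolves these small partial answers.

```python
from collections import deque

def partition(edges, site_of):
    """Split a global edge list into per-site fragments (demo helper, assumed)."""
    fragments = {}
    for node, site in site_of.items():
        frag = fragments.setdefault(
            site, {"nodes": set(), "edges": {}, "in_nodes": set()})
        frag["nodes"].add(node)
    for u, v in edges:
        fragments[site_of[u]]["edges"].setdefault(u, []).append(v)
        if site_of[u] != site_of[v]:
            # v receives a cross-site edge, so it is an entry point of its site
            fragments[site_of[v]]["in_nodes"].add(v)
    return list(fragments.values())

def local_eval(fragment, source, target):
    """Partial evaluation at one site, using only that site's data.

    Returns {entry node: (target reached locally, remote nodes reached)}.
    """
    nodes = fragment["nodes"]
    starts = set(fragment["in_nodes"])
    if source in nodes:
        starts.add(source)
    partial = {}
    for start in starts:
        seen, queue = {start}, deque([start])
        hit, remote = False, set()
        while queue:
            u = queue.popleft()
            if u == target:
                hit = True
                break
            for v in fragment["edges"].get(u, ()):
                if v not in nodes:
                    remote.add(v)      # stored at another site; left unresolved
                elif v not in seen:
                    seen.add(v)
                    queue.append(v)
        partial[start] = (hit, remote)
    return partial

def assemble(partials, source):
    """Coordinator: combine the partial answers without touching the graph."""
    equations = {}
    for partial in partials:
        equations.update(partial)
    seen, queue = {source}, deque([source])
    while queue:
        x = queue.popleft()
        hit, deps = equations.get(x, (False, set()))
        if hit:
            return True
        for d in deps:
            if d not in seen:
                seen.add(d)
                queue.append(d)
    return False

def reachable(fragments, source, target):
    # Each fragment is evaluated exactly once; the calls are independent,
    # so in a real deployment they could run in parallel at the sites.
    partials = [local_eval(f, source, target) for f in fragments]
    return assemble(partials, source)

edges = [("f", "a"), ("a", "b"), ("b", "c"), ("c", "d"), ("d", "e")]
site_of = {"a": 0, "b": 0, "c": 1, "d": 1, "e": 2, "f": 2}
fragments = partition(edges, site_of)
print(reachable(fragments, "a", "e"))  # True
print(reachable(fragments, "e", "a"))  # False
```

In this sketch the data sent to the coordinator has one entry per entry node of a fragment, so its size depends on how the graph is cut rather than on how large each fragment is, which mirrors the flavour of guarantee (2) without reproducing the paper's actual bounds.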
Subject
General Earth and Planetary Sciences; Water Science and Technology; Geography, Planning and Development
Cited by
43 articles.