Authors
Katrin Casel, Markus L. Schmid
Abstract
A regular path query (RPQ) is a regular expression q that returns all node
pairs (u, v) from a graph database that are connected by an arbitrary path
labelled with a word from L(q). The obvious algorithmic approach to
RPQ-evaluation (called the PG-approach), i.e., constructing the product graph
between an NFA for q and the graph database, is appealing due to its simplicity
and also leads to efficient algorithms. However, it is unclear whether the
PG-approach is optimal. We address this question by thoroughly investigating
which upper complexity bounds can be achieved by the PG-approach, and we
complement these with conditional lower bounds (in the sense of the
fine-grained complexity framework). A special focus is put on enumeration and
delay bounds, as well as the data complexity perspective. A main insight is
that we can achieve optimal (or near-optimal) algorithms with the PG-approach,
but the delay for enumeration is rather high (linear in the database). We
explore three successful approaches towards enumeration with sub-linear delay:
super-linear preprocessing, approximations of the solution sets, and restricted
classes of RPQs.
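
To make the PG-approach concrete, the following is a minimal sketch and not the authors' algorithm. It assumes the query q has already been converted into an NFA without epsilon-transitions, and that the graph database is given as a list of (source, label, target) triples. The function name `evaluate_rpq`, its parameters, and the example data are hypothetical. The product graph is explored implicitly: a breadth-first search over (node, state) pairs is started from every database node, and a pair (u, v) is reported whenever the search from u reaches v in an accepting state.

```python
from collections import deque


def evaluate_rpq(edges, nfa_delta, nfa_start, nfa_accept):
    """Return all pairs (u, v) joined by a path whose label word the NFA accepts.

    edges:      iterable of (source, label, target) triples (the graph database)
    nfa_delta:  dict mapping (state, label) -> set of successor states
                (assumption: no epsilon-transitions)
    nfa_start:  the single start state of the NFA
    nfa_accept: set of accepting states
    """
    # Adjacency lists of the database: node -> list of (label, target).
    adj = {}
    nodes = set()
    for u, label, v in edges:
        adj.setdefault(u, []).append((label, v))
        nodes.update((u, v))

    result = set()
    for source in nodes:
        # BFS in the product graph, whose vertices are (node, state) pairs.
        start = (source, nfa_start)
        seen = {start}
        queue = deque([start])
        while queue:
            node, state = queue.popleft()
            if state in nfa_accept:
                result.add((source, node))
            # A product edge exists when the database edge and the NFA
            # transition carry the same label.
            for label, next_node in adj.get(node, []):
                for next_state in nfa_delta.get((state, label), ()):
                    pair = (next_node, next_state)
                    if pair not in seen:
                        seen.add(pair)
                        queue.append(pair)
    return result


# Hypothetical example: the query a b* as a two-state NFA.
delta = {(0, "a"): {1}, (1, "b"): {1}}
db = [(1, "a", 2), (2, "b", 3), (3, "b", 2), (3, "a", 4)]
pairs = evaluate_rpq(db, delta, nfa_start=0, nfa_accept={1})
print(sorted(pairs))  # [(1, 2), (1, 3), (3, 4)]
```

In this naive form, each of the searches may traverse the whole product graph before the next result pair is found, which illustrates why the delay between reported pairs can be linear in the database.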
Publisher
Centre pour la Communication Scientifique Directe (CCSD)
Subject
General Computer Science, Theoretical Computer Science