Affiliation:
1. Guangzhou University
2. Guangzhou University, PengCheng Laboratory
3. Shanghai Jiao Tong University
Abstract
There are two fundamental problems in regular simple path queries (RSPQs). One is the reachability problem, which asks whether there exists a simple path between the source and the target vertex matching the given regular expression; the other is the enumeration problem, which aims to find all the matched simple paths. As an important computing component of graph databases, RSPQs are supported in many graph database query languages such as PGQL and openCypher. However, answering RSPQs is known to be NP-hard, making it challenging to design scalable solutions that support a wide range of expressions. In this paper, we first introduce the class of transitive restricted expressions, which covers more than 99% of real-world queries. Then, we propose an efficient algorithmic framework to support both the reachability and enumeration problems under transitive restricted expression constraints. To boost performance, we develop novel techniques for reachability detection, the search of candidate vertices, and the reduction of redundant path computation. Extensive experiments demonstrate that our exact method achieves efficiency comparable to the state-of-the-art approximate approach and outperforms the state-of-the-art exact methods by up to two orders of magnitude.
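To make the two problems concrete, the following is a minimal brute-force sketch, not the framework proposed in the paper. It assumes the regular expression has already been compiled into a deterministic finite automaton, and it explores the graph by depth-first search while forbidding repeated vertices. All names (`enumerate_rspq`, `reachable_rspq`, the toy graph, and the `knows+` automaton) are hypothetical and chosen only for illustration. Its worst-case running time is exponential, consistent with the NP-hardness noted above.

```python
from typing import Dict, Iterator, List, Set, Tuple

Graph = Dict[str, List[Tuple[str, str]]]  # vertex -> [(edge label, neighbor)]
Delta = Dict[Tuple[int, str], int]        # (DFA state, edge label) -> next DFA state


def enumerate_rspq(graph: Graph, delta: Delta, start: int, accepting: Set[int],
                   source: str, target: str) -> Iterator[List[str]]:
    """Yield every simple path from source to target whose label sequence the DFA accepts."""
    path: List[str] = [source]
    on_path: Set[str] = {source}

    def dfs(v: str, q: int) -> Iterator[List[str]]:
        if v == target:
            # A simple path cannot revisit the target, so stop extending here.
            if q in accepting:
                yield list(path)
            return
        for label, w in graph.get(v, []):
            nq = delta.get((q, label))
            if nq is None or w in on_path:
                continue  # label rejected by the DFA, or vertex already used
            path.append(w)
            on_path.add(w)
            yield from dfs(w, nq)
            path.pop()
            on_path.remove(w)

    yield from dfs(source, start)


def reachable_rspq(graph: Graph, delta: Delta, start: int, accepting: Set[int],
                   source: str, target: str) -> bool:
    """Reachability: does at least one matching simple path exist?"""
    return next(enumerate_rspq(graph, delta, start, accepting, source, target), None) is not None


# Toy labeled graph (hypothetical data) containing the cycle a -> c -> a.
toy_graph: Graph = {
    "a": [("knows", "b"), ("knows", "c")],
    "b": [("knows", "c"), ("likes", "c")],
    "c": [("knows", "a")],
}
# DFA for the expression knows+ : state 0 is initial, state 1 is accepting.
knows_plus: Delta = {(0, "knows"): 1, (1, "knows"): 1}

paths = list(enumerate_rspq(toy_graph, knows_plus, 0, {1}, "a", "c"))
print(paths)
```

On this toy input the enumeration returns the two matching simple paths from `a` to `c`, while the `likes` edge is rejected by the automaton. The reachability variant simply stops at the first path found.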
Publisher
Association for Computing Machinery (ACM)