Affiliation:
1. University of Waterloo, Waterloo, ON, Canada
2. Hong Kong Baptist University, Kowloon, Hong Kong
Abstract
We investigate how to efficiently compute the difference of two (or multiple) conjunctive queries, which is the last operator in relational algebra to be unraveled. The standard approach in practical database systems is to materialize the results of every input query as a separate set, and then compute the difference of the two (or multiple) sets. This approach is bottlenecked by the cost of evaluating every input query individually, which can be very expensive, particularly when there are only a few results in the difference. In this paper, we introduce a new approach that exploits the structural properties of the input queries and rewrites the original query by pushing the difference operator down as far as possible. We show that for a large class of difference queries, this approach leads to a linear-time algorithm in terms of the input size and the (final) output size, i.e., the number of query results that survive the difference operator. We complete this result by showing the hardness of computing the remaining difference queries in linear time. Although a linear-time algorithm is hard to achieve in general, we also provide some heuristics that provably improve on the standard approach. Finally, we compare our approach with standard SQL engines over graph and benchmark datasets. The experimental results demonstrate order-of-magnitude speedups achieved by our approach over the vanilla SQL engine.
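To make the idea of pushing the difference operator down concrete, the following is a minimal sketch of one toy special case, not the algorithm from the paper. It assumes two queries Q1 = R(a, b) ⋈ S(b, c) and Q2 = R(a, b) ⋈ T(b, c) that share the relation R; in that case Q1 − Q2 = R ⋈ (S − T), so the difference can be taken on the small base relations before joining. The relation names, schemas, and function names are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def join(r, s):
    """Natural join of R(a, b) and S(b, c) on b, as a set of (a, b, c) tuples."""
    by_b = defaultdict(list)
    for b, c in s:
        by_b[b].append(c)
    # Use .get so that probing does not add empty entries to the index.
    return {(a, b, c) for a, b in r for c in by_b.get(b, ())}


def difference_baseline(r, s, t):
    """Standard approach: materialize both join results, then subtract."""
    return join(r, s) - join(r, t)


def difference_pushed_down(r, s, t):
    """Toy rewrite (assumed special case): subtract T from S first, then join.

    Valid here because both queries join the same R on the same attribute,
    so a tuple (a, b, c) survives exactly when (b, c) is in S but not in T.
    """
    return join(r, set(s) - set(t))


# Hypothetical sample data.
R = {(1, 10), (2, 10), (3, 20)}
S = {(10, "x"), (10, "y"), (20, "z")}
T = {(10, "x"), (20, "z")}

baseline = difference_baseline(R, S, T)
pushed = difference_pushed_down(R, S, T)
print(sorted(pushed))
```

In this sketch the baseline materializes five and three join tuples before subtracting, whereas the pushed-down version joins R with a single remaining tuple of S − T. The general case treated in the paper covers a much broader class of queries than this shared-relation example.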
Publisher
Association for Computing Machinery (ACM)
Cited by
2 articles.