Affiliation:
1. Saarland University, Saarland Informatics Campus, Saarbrücken, Germany
Abstract
Join order optimization is one of the most fundamental problems in processing queries on relational data. It has been studied extensively for almost four decades. Still, because the problem is NP-hard, no generally efficient solution exists, and it remains an important topic of research. Algorithms for computing join orders range from exhaustive enumeration, to combinatorics based on graph properties, to greedy search, to genetic algorithms, and, more recently, to machine learning. A few works use heuristic search to compute join orders. However, a theoretical argument for why and how heuristic search applies to join order optimization is lacking.
In this work, we investigate join order optimization via heuristic search. In particular, we provide a strong theoretical framework in which we reduce join order optimization to the shortest path problem. We then thoroughly analyze the properties of this problem and the applicability of heuristic search. We devise crucial optimizations to make heuristic search tractable. We implement join ordering via heuristic search in a real DBMS and conduct an extensive empirical study. Our findings show that for star- and clique-shaped queries, heuristic search finds optimal plans an order of magnitude faster than the current state of the art. Our suboptimal solutions further extend the cost/time Pareto frontier.
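To make the idea of treating join ordering as a shortest path problem concrete, the following is a minimal illustrative sketch, not the paper's actual formulation. It assumes left-deep plans only, a search state that is simply the set of relations joined so far, and an edge cost equal to the estimated size of the intermediate result. The function names, the cardinality model, and the default zero heuristic (which turns A* into plain uniform-cost search) are all assumptions made for illustration.

```python
import heapq
from itertools import count


def cardinality(rels, cards, sels):
    """Hypothetical estimate: product of base sizes and of the
    selectivities of all join predicates fully contained in `rels`."""
    size = 1.0
    for r in rels:
        size *= cards[r]
    for pair, s in sels.items():
        if pair <= rels:
            size *= s
    return size


def shortest_path_join_order(cards, sels, h=lambda state: 0.0):
    """A*-style search over sets of joined relations (left-deep plans only).

    `h` must never overestimate the remaining cost; the default h = 0
    is trivially admissible and yields uniform-cost (Dijkstra) search.
    Returns (total_cost, join_order) or None if no plan exists.
    """
    goal = frozenset(cards)
    start = frozenset()
    tie = count()  # tie-breaker so the heap never compares states
    frontier = [(h(start), next(tie), 0.0, start, ())]
    best = {start: 0.0}
    while frontier:
        _, _, g, state, order = heapq.heappop(frontier)
        if state == goal:
            return g, order
        if g > best.get(state, float("inf")):
            continue  # stale heap entry; a cheaper path was found already
        for r in sorted(goal - state):
            nxt = state | {r}
            # Picking the first relation is free; each join costs the
            # estimated size of the intermediate result it produces.
            step = cardinality(nxt, cards, sels) if len(nxt) > 1 else 0.0
            ng = g + step
            if ng < best.get(nxt, float("inf")):
                best[nxt] = ng
                heapq.heappush(
                    frontier, (ng + h(nxt), next(tie), ng, nxt, order + (r,))
                )
    return None


if __name__ == "__main__":
    # Made-up chain query A - B - C with invented statistics.
    cards = {"A": 1000, "B": 100, "C": 10}
    sels = {frozenset("AB"): 0.01, frozenset("BC"): 0.1}
    print(shortest_path_join_order(cards, sels))
```

With these invented statistics, the cheapest left-deep order joins B and C first and adds A last. Replacing the zero heuristic with an informative lower bound on the remaining cost is what would let the search skip large parts of the state space.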
Publisher
Association for Computing Machinery (ACM)