Affiliation:
1. Systems Group, ETH Zurich, Switzerland
Abstract
As a result of increases in both the query load and the data managed, as well as changes in hardware architecture (multicore), recent years have seen a shift from query-at-a-time approaches towards shared work (SW) systems, where queries are executed in groups. Such groups share operators like scans and joins, leading to systems that process hundreds to thousands of queries in one go.
SW systems range from storage engines that use in-memory cooperative scans to more complex query processing engines that share joins over analytical and star schema queries. In all cases, they rely on single-query optimizers, predicate sharing, or manually generated plans. In this paper we explore the problem of shared workload optimization (SWO) for SW systems. The challenge is that the optimization has to be done for the entire workload, which results in an optimization problem in the class of stochastic knapsack problems with uncertain weights. Such problems can only be addressed with heuristics if a reasonable runtime is to be achieved. We focus on hash joins and shared scans and present a first algorithm capable of optimizing the execution of entire workloads by deriving a global execution plan for all the queries in the system. We evaluate the optimizer over the TPC-W and TPC-H benchmarks. The results prove the feasibility of this approach and demonstrate the performance gains that can be obtained from SW systems.
Subject
General Earth and Planetary Sciences,Water Science and Technology,Geography, Planning and Development
Cited by
37 articles.
1. Exploiting Shared Sub-Expression and Materialized View Reuse for Multi-Query Optimization;Information Systems Frontiers;2024-06-25
2. Optimizing Disjunctive Queries with Tagged Execution;Proceedings of the ACM on Management of Data;2024-05-29
3. RTScan: Efficient Scan with Ray Tracing Cores;Proceedings of the VLDB Endowment;2024-02
4. Lemo: A Cache-Enhanced Learned Optimizer for Concurrent Queries;Proceedings of the ACM on Management of Data;2023-12-08
5. tf.data service;Proceedings of the 2023 ACM Symposium on Cloud Computing;2023-10-30