Author:
Gurumurthy Bala, Bidarkar Vasudev Raghavendra, Broneske David, Pionteck Thilo, Saake Gunter
Abstract
Executing queries in isolation forgoes the reuse of intermediate results and thus wastes computational resources. Multi-Query Optimization (MQO) addresses this challenge by devising a shared execution strategy across queries, using one of two common strategies: batching or caching. Both strategies have been shown to improve performance, but hardly any study explores their combination. In this work we explore such a hybrid MQO, combining batching (Shared Sub-Expression) and caching (Materialized View Reuse) techniques. Our hybrid MQO system merges batched query results and caches the intermediate results, so that any new query is given a path within the previous plan and can reuse those results. Since caching is a key component for improving performance, we measure the impact of common cache replacement policies: FIFO, LRU, MRU and LFU. Our results show that LRU performs best for our use case, so we use it in our subsequent evaluations. To study the influence of batching, we vary a factor that represents the similarity of the results within a query batch. Similarly, we vary the cache size to study the influence of caching. Moreover, we study the role of different database operators in the performance of our hybrid system. The results suggest that, depending on the individual operators, the outcome of our hybrid method ranges from a speed-up of 4x to a slowdown of 2x compared with using the MQO techniques in isolation. Furthermore, our results show that workloads that contain similar queries and have a generously sized cache benefit from our hybrid method, with an observed speed-up of 2x over sequential execution in the best case.
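The interplay of batching and LRU caching described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' system: the names `LRUResultCache` and `run_batch` are hypothetical, each query is assumed to be a list of sub-expression identifiers, and `evaluate` is an assumed callback that computes one sub-expression and never returns `None`.

```python
from collections import OrderedDict


class LRUResultCache:
    """Hypothetical cache holding at most `capacity` intermediate results."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._entries = OrderedDict()

    def get(self, key):
        if key not in self._entries:
            return None  # miss (assumes cached values are never None)
        self._entries.move_to_end(key)  # mark as most recently used
        return self._entries[key]

    def put(self, key, value):
        self._entries[key] = value
        self._entries.move_to_end(key)
        if len(self._entries) > self.capacity:
            self._entries.popitem(last=False)  # evict least recently used


def run_batch(batch, cache, evaluate):
    """Evaluate each distinct sub-expression of a batch at most once.

    Sharing inside the batch stands in for batching; lookups in `cache`
    stand in for reuse of materialized results across batches.
    """
    shared = {}    # sub-expression -> result, shared within this batch
    computed = 0   # number of sub-expressions actually evaluated
    for query in batch:
        for subexpr in query:
            if subexpr in shared:
                continue  # already available from an earlier query in the batch
            result = cache.get(subexpr)
            if result is None:
                result = evaluate(subexpr)
                computed += 1
                cache.put(subexpr, result)
            shared[subexpr] = result
    return shared, computed
```

In this sketch, a small `capacity` forces evictions and more recomputation in later batches, which loosely mirrors the cache-size dimension varied in the evaluation.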
Funder
Deutsche Forschungsgemeinschaft
Publisher
Springer Science and Business Media LLC