Leveraging query logs and machine learning for parametric query optimization

Published:2021-11 Issue:3 Volume:15 Page:401-413
ISSN:2150-8097
Container-title:Proceedings of the VLDB Endowment
language:en
Short-container-title:Proc. VLDB Endow.

Author:

Vaidya Kapil¹, Dutt Anshuman², Narasayya Vivek², Chaudhuri Surajit²

Affiliation:

1. MIT

2. Microsoft Research

Abstract

Parametric query optimization (PQO) must address two problems: identify a relatively small number of plans to cache for a parameterized query (populateCache), and efficiently select the best cached plan to use for executing any instance of the parameterized query (getPlan). Our approach decouples these two decisions. We formulate populateCache as an optimization problem with the goal of identifying a set of plans that minimizes the optimizer-estimated cost of queries in the log, and present an efficient algorithm. For getPlan, we leverage query logs to train machine learning (ML) models to choose the lowest optimizer-estimated cost plan from the cached plans. We conduct extensive experiments using complex parameterized queries from benchmarks and real workloads. Our algorithm for populateCache achieves low geometric mean sub-optimality (1.2) even for complex queries using relatively few plans, and scales well to large query logs. The mean latency of our ML-model-based getPlan technique (~210 μsec) is between one and four orders of magnitude lower than that of prior PQO techniques. The mean sub-optimality is low (1.05), and the 95th percentile sub-optimality (1.3) is between 1.1× and 25× lower than that of prior techniques. Finally, we present an efficient algorithm for getPlan that leverages execution time information in query logs to circumvent inaccuracies of the query optimizer's cost estimates.
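
The populateCache formulation in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm, which the abstract does not describe. It substitutes a plain greedy heuristic, and it assumes a hypothetical input `cost`, where `cost[q][p]` is the optimizer-estimated cost of plan `p` for logged query instance `q`. The function names and the definition of sub-optimality (cost of the best cached plan divided by the cost of the best plan overall) are likewise assumptions made for illustration.

```python
import math


def populate_cache(cost, k):
    """Greedily choose up to k plan indices to cache (illustrative heuristic).

    cost[q][p] is the assumed optimizer-estimated cost of plan p for
    logged query instance q. Each round adds the plan that most reduces
    the total cost of the log when every query uses its cheapest cached plan.
    """
    n_plans = len(cost[0])
    chosen = []
    # Cheapest cost per logged query among the plans chosen so far.
    best = [math.inf] * len(cost)

    for _ in range(min(k, n_plans)):
        def total_if_added(p):
            return sum(min(b, row[p]) for b, row in zip(best, cost))

        candidates = [p for p in range(n_plans) if p not in chosen]
        p_star = min(candidates, key=total_if_added)
        chosen.append(p_star)
        best = [min(b, row[p_star]) for b, row in zip(best, cost)]

    return chosen


def geo_mean_suboptimality(cost, chosen):
    """Geometric mean over the log of (best cached cost / best overall cost)."""
    ratios = [min(row[p] for p in chosen) / min(row) for row in cost]
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))
```

A ratio of 1.0 means the cached plans lose nothing against the per-query optimal plan for the logged instances. The getPlan step described in the abstract would replace the exhaustive `min` over cached plans with a trained model's prediction.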

Publisher

Association for Computing Machinery (ACM)

Subject

General Earth and Planetary Sciences; Water Science and Technology; Geography, Planning and Development

Link

https://dl.acm.org/doi/pdf/10.14778/3494124.3494126

References: 43 articles.

1. [n.d.]. https://scikit-learn.org/stable/modules/tree.html.

2. [n.d.]. http://www.tpc.org/tpcds/.

3. [n.d.]. https://www.microsoft.com/en-us/download/confirmation.aspx?id=52430.

4. [n.d.]. Plan Caching in SQL Server. https://docs.microsoft.com/en-us/sql/relational-databases/performance-monitor/sql-server-plan-cache-object?view=sql-server-ver15.

5. [n.d.]. Plan guides in SQL Server. https://docs.microsoft.com/en-us/sql/relational-databases/performance/plan-guides?view=sql-server-ver15.

Cited by 4 articles.

1. The Holon Approach for Simultaneously Tuning Multiple Components in a Self-Driving Database Management System with Machine Learning via Synthesized Proto-Actions;Proceedings of the VLDB Endowment;2024-07

2. TRAP: Tailored Robustness Assessment for Index Advisors via Adversarial Perturbation;2024 IEEE 40th International Conference on Data Engineering (ICDE);2024-05-13

3. Kepler: Robust Learning for Parametric Query Optimization;Proceedings of the ACM on Management of Data;2023-05-26

4. Fine-grained modeling and optimization for intelligent resource management in big data processing;Proceedings of the VLDB Endowment;2022-07