CEDA: Learned Cardinality Estimation with Domain Adaptation

Published:2023-08 Issue:12 Volume:16 Page:3934-3937
ISSN:2150-8097
Container-title:Proceedings of the VLDB Endowment
language:en
Short-container-title:Proc. VLDB Endow.

Author:

Wang Zilong¹, Zeng Qixiong¹, Wang Ning¹, Lu Haowen¹, Zhang Yue¹

Affiliation:

1. Beijing Jiaotong University, China

Abstract

Cardinality Estimation (CE) is a fundamental and critical problem in DBMS query optimization, and deep learning techniques have made significant breakthroughs in CE research. However, apart from requiring sufficiently large training data to cover all possible query regions for accurate estimation, current query-driven CE methods also suffer from workload drifts. Retraining or fine-tuning requires cardinality labels as ground truth, and obtaining these labels through the DBMS is expensive. Therefore, we propose CEDA, a novel domain-adaptive CE system. CEDA achieves more accurate estimations by automatically generating workloads as training data according to the data distribution in the database, and by incorporating histogram information into an attention-based cardinality estimator. To address workload drifts in real-world environments, CEDA adopts a domain adaptation strategy that makes the model more robust, so that it performs well on an unlabeled workload whose feature distribution differs substantially from that of the training set.
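The idea of generating a labeled training workload from the data itself can be illustrated with a minimal Python sketch. This is not CEDA's actual generator, which the abstract does not specify. It assumes a small in-memory table of numeric tuples, and the function name `generate_workload`, the tuple-centered range predicates, and the brute-force labeling are all hypothetical choices made for illustration only.

```python
import random


def generate_workload(rows, num_queries, seed=0):
    """Return (predicates, cardinality) pairs centered on sampled tuples.

    Hypothetical illustration: predicates maps a column index to an
    inclusive (low, high) range; cardinality is the matching row count.
    """
    rng = random.Random(seed)
    num_cols = len(rows[0])
    # Per-column value range, used to scale predicate widths.
    lows = [min(r[c] for r in rows) for c in range(num_cols)]
    highs = [max(r[c] for r in rows) for c in range(num_cols)]

    workload = []
    for _ in range(num_queries):
        # Picking an existing tuple as the query center makes dense
        # regions of the data more likely to be queried.
        center = rng.choice(rows)
        cols = rng.sample(range(num_cols), rng.randint(1, num_cols))
        predicates = {}
        for c in cols:
            half_width = rng.uniform(0.0, 0.5) * (highs[c] - lows[c])
            predicates[c] = (center[c] - half_width, center[c] + half_width)
        # Label by brute-force counting (assumption: the table fits in
        # memory; a real system would obtain labels from the DBMS).
        card = sum(
            all(lo <= r[c] <= hi for c, (lo, hi) in predicates.items())
            for r in rows
        )
        workload.append((predicates, card))
    return workload


# Toy table with three numeric columns.
rows = [(i % 10, i % 7, i // 10) for i in range(100)]
workload = generate_workload(rows, num_queries=5, seed=42)
```

Because every query is centered on a real tuple, each generated query matches at least one row, which avoids spending training examples on empty regions. How CEDA selects predicates and where it obtains labels may differ from this sketch.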

Publisher

Association for Computing Machinery (ACM)

Subject

General Earth and Planetary Sciences; Water Science and Technology; Geography, Planning and Development

Link

https://dl.acm.org/doi/pdf/10.14778/3611540.3611589

References: 9 articles.

1. Generating Queries with Cardinality Constraints for DBMS Testing

2. Selectivity estimation for range predicates using lightweight models

3. Andreas Kipf, Thomas Kipf, Bernhard Radke, Viktor Leis, Peter Boncz, and Alfons Kemper. 2018. Learned cardinalities: Estimating correlated joins with deep learning. arXiv preprint arXiv:1809.00677 (2018).

4. How good are query optimizers, really?

5. Donald R Slutz. 1998. Massive stochastic testing of SQL. In VLDB, Vol. 98. Citeseer, 618-622.

Cited by 4 articles.

1. Machine Learning for Databases: Foundations, Paradigms, and Open problems;Companion of the 2024 International Conference on Management of Data;2024-06-09

2. Automating localized learning for cardinality estimation based on XGBoost;Knowledge and Information Systems;2024-06-01

3. Towards Exploratory Query Optimization for Template-Based SQL Workloads;2024 IEEE 40th International Conference on Data Engineering (ICDE);2024-05-13

4. Refactoring Index Tuning Process with Benefit Estimation;Proceedings of the VLDB Endowment;2024-03