Balancing histogram optimality and practicality for query result size estimation

Published:1995-05-22 Issue:2 Volume:24 Page:233-244
ISSN:0163-5808
Container-title:ACM SIGMOD Record
language:en
Short-container-title:SIGMOD Rec.

Author:

Ioannidis Yannis E.¹, Poosala Viswanath¹

Affiliation:

1. Computer Sciences Department, University of Wisconsin, Madison, WI

Abstract

Many current database systems use histograms to approximate the frequency distribution of values in the attributes of relations and based on them estimate query result sizes and access plan costs. In choosing among the various histograms, one has to balance between two conflicting goals: optimality, so that generated estimates have the least error, and practicality, so that histograms can be constructed and maintained efficiently. In this paper, we present both theoretical and experimental results on several issues related to this trade-off. Our overall conclusion is that the most effective approach is to focus on the class of histograms that accurately maintain the frequencies of a few attribute values and assume the uniform distribution for the rest, and choose for each relation the histogram in that class that is optimal for a self-join query.
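The histogram class favored in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes the "few attribute values" kept exactly are the k most frequent ones, and the names `build_histogram`, `estimate_self_join`, and `true_self_join` are hypothetical. All remaining values are assumed to share one average frequency (the uniform assumption), and the self-join result size is taken as the sum of squared frequencies.

```python
from collections import Counter


def build_histogram(values, k):
    """Keep exact counts for the k most frequent values (an assumption of
    this sketch) and summarize all other values by their average count."""
    counts = Counter(values)
    exact = dict(counts.most_common(k))
    rest = [c for v, c in counts.items() if v not in exact]
    rest_avg = sum(rest) / len(rest) if rest else 0.0
    return exact, len(rest), rest_avg


def estimate_self_join(hist):
    """Estimate the self-join size from the histogram: exact squared
    frequencies plus the uniform approximation for the remaining values."""
    exact, n_rest, rest_avg = hist
    return sum(c * c for c in exact.values()) + n_rest * rest_avg ** 2


def true_self_join(values):
    """Exact self-join size: the sum of squared value frequencies."""
    return sum(c * c for c in Counter(values).values())


if __name__ == "__main__":
    data = ["a"] * 5 + ["b"] * 3 + ["c"] + ["d"]
    for k in (1, 2):
        est = estimate_self_join(build_histogram(data, k))
        print(k, est, true_self_join(data))
```

On skewed data, storing even a small number of exact frequencies removes most of the estimation error, because the squared terms of the frequent values dominate the sum.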

Publisher

Association for Computing Machinery (ACM)

Subject

Information Systems, Software

Link

https://dl.acm.org/doi/pdf/10.1145/568271.223841

Reference: 23 articles.

1. Adaptive selectivity estimation using query feedback

2. Estimating block transfers and join sizes

3. Implications of certain assumptions in database performance evaluation

Cited by 40 articles.

1. PairwiseHist: Fast, Accurate and Space-Efficient Approximate Query Processing with Data Compression;Proceedings of the VLDB Endowment;2024-02

2. JoinSketch: A Sketch Algorithm for Accurate and Unbiased Inner-Product Estimation;Proceedings of the ACM on Management of Data;2023-05-26

3. FactorJoin: A New Cardinality Estimation Framework for Join Queries;Proceedings of the ACM on Management of Data;2023-05-26

4. Constructing outlier-free histograms with variable bin-width based on distance minimization;Intelligent Data Analysis;2023-01-30

5. Should We Consider On-Demand Analysis in Scale-Free Networks?;Advances in Intelligent Data Analysis XXI;2023