The Moments Method for Approximate Data Cube Queries

Published: 2024-05-10 Issue: 2 Volume: 2 Page: 1-23
ISSN: 2836-6573
Container-title: Proceedings of the ACM on Management of Data
Language: en
Short-container-title: Proc. ACM Manag. Data

Author:

Lindner Peter¹, Basil John Sachin¹, Koch Christoph¹, Suciu Dan²

Affiliation:

1. École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, VD, Switzerland

2. University of Washington, Seattle, WA, USA

Abstract

We investigate an approximation algorithm for various aggregate queries on partially materialized data cubes. Data cubes are interpreted as probability distributions, and cuboids from a partial materialization populate the terms of a series expansion of the target query distribution. Unknown terms in the expansion are just assumed to be 0 in order to recover an approximate query result. We identify this method as a variant of related approaches from other fields of science, that is, the Bahadur representation and, more generally, (biased) Fourier expansions of Boolean functions. Existing literature indicates a rich but intricate theoretical landscape. Focusing on the data cube application, we start by investigating worst-case error bounds. We build upon prior work to obtain provably optimal materialization strategies with respect to query workloads. In addition, we propose a new heuristic method governing materialization decisions. Finally, we show that well-approximated queries are guaranteed to have well-approximated roll-ups.
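
The idea of a truncated series expansion can be illustrated with the classical Bahadur representation for binary attributes. The following is a minimal sketch, not the paper's algorithm: it assumes every attribute is 0/1-valued with a marginal strictly between 0 and 1, and that all cuboids over at most `max_order` attributes are available. The function name `bahadur_approx` and its parameters are hypothetical. Coefficients of larger attribute sets are treated as 0.

```python
import itertools
import math


def bahadur_approx(joint, n, max_order):
    """Approximate a joint distribution over n binary attributes.

    joint: dict mapping each 0/1 tuple of length n to its probability.
    Only correlation coefficients of attribute sets with at most
    max_order members are used; all higher-order ones are set to 0.
    """
    points = list(itertools.product((0, 1), repeat=n))
    # One-dimensional marginals p_i = P(X_i = 1).
    p = [sum(joint[x] for x in points if x[i] == 1) for i in range(n)]
    sd = [math.sqrt(q * (1 - q)) for q in p]

    def z(x, i):
        # Standardized value of attribute i at point x.
        return (x[i] - p[i]) / sd[i]

    subsets = [S for k in range(2, max_order + 1)
               for S in itertools.combinations(range(n), k)]
    # r_S = E[prod_{i in S} Z_i]. Each r_S depends only on the marginal
    # over S; it is computed from the full joint here purely for brevity.
    r = {S: sum(joint[x] * math.prod(z(x, i) for i in S) for x in points)
         for S in subsets}

    approx = {}
    for x in points:
        independent = math.prod(p[i] if x[i] else 1 - p[i] for i in range(n))
        correction = 1.0 + sum(r[S] * math.prod(z(x, i) for i in S)
                               for S in subsets)
        approx[x] = independent * correction
    return approx
```

With `max_order` equal to `n` the expansion reproduces the joint distribution exactly, and with `max_order=1` it reduces to the product of the one-dimensional marginals. For intermediate orders the entries still sum to 1, but individual entries may fall outside the interval [0, 1].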

Funder

NSF

Publisher

Association for Computing Machinery (ACM)

Link

https://dl.acm.org/doi/pdf/10.1145/3651147
