Affiliation:
1. University of Waterloo, Waterloo, Canada
2. LIMOS, CNRS, University of Clermont Auvergne, Clermont-Ferrand, France
Abstract
This article presents Sharing User-Defined Aggregate Function (SUDAF), a declarative framework that allows users to write User-defined Aggregate Functions (UDAFs) as mathematical expressions and use them in Structured Query Language statements.
SUDAF rewrites partial aggregates of UDAFs using built-in aggregate functions and supports efficient dynamic caching and reuse of partial aggregates. Our experiments show that rewriting UDAFs using built-in functions can significantly speed up queries with UDAFs, and that the proposed sharing approach can yield up to two orders of magnitude improvement in query execution time. The article also studies an extension of SUDAF to support sharing partial results between arbitrary queries with UDAFs. We show a connection with the problem of query rewriting using views and introduce a new class of rewritings, called SUDAF rewritings, which enables the use of views whose aggregate functions differ from those used in the input query. We investigate the underlying rewriting-checking and rewriting-existing problems. Our main technical result is a reduction of these problems to, respectively, rewriting-checking and rewriting-existing of the so-called aggregate candidates, a class of rewritings that has been deeply investigated in the literature.
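To make the idea of rewriting UDAFs over built-in partial aggregates more concrete, the following is a minimal Python sketch. It is not SUDAF's implementation or API: the function names, the cache keyed by partition id, and the choice of mean and population variance as example UDAFs are all assumptions made for illustration. It shows two aggregates expressed as mathematical expressions over the same built-in partials (count, sum, sum of squares), so that partials computed for one query can be cached and reused by another.

```python
from functools import reduce

compute_calls = 0  # counts how often a partition is actually scanned
cache = {}         # hypothetical cache: partition id -> partial-aggregate state


def partials(values):
    """Scan one partition and compute built-in partial aggregates."""
    global compute_calls
    compute_calls += 1
    return {
        "count": len(values),
        "sum": sum(values),
        "sum_sq": sum(v * v for v in values),
    }


def cached_partials(pid, values):
    """Return cached partials for a partition, computing them only once."""
    if pid not in cache:
        cache[pid] = partials(values)
    return cache[pid]


def merge(a, b):
    """Merge two states; every partial here is a sum, so merging is addition."""
    return {k: a[k] + b[k] for k in a}


def mean(p):
    """Example UDAF: mean expressed over the built-in partials."""
    return p["sum"] / p["count"]


def variance(p):
    """Example UDAF: population variance, E[x^2] - (E[x])^2."""
    return p["sum_sq"] / p["count"] - mean(p) ** 2


def run(udaf, partitions):
    """Evaluate a UDAF over all partitions, reusing cached partials."""
    states = [cached_partials(pid, vals) for pid, vals in partitions.items()]
    return udaf(reduce(merge, states))


partitions = {"p1": [1.0, 2.0, 3.0], "p2": [4.0, 5.0]}
mean_result = run(mean, partitions)          # scans both partitions
variance_result = run(variance, partitions)  # reuses the cached partials
```

In this sketch the second query triggers no new scan, because both aggregates are defined over the same cached partials. The difference-of-squares variance formula is used only for brevity and can lose precision on real data.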
Publisher
Association for Computing Machinery (ACM)