Affiliation:
1. School of Computing Science, Simon Fraser University, B.C., Canada
2. Department of Computer Science, Wright State University, Dayton, OH
Abstract
It is often too expensive to compute and materialize a complete high-dimensional data cube. Computing an iceberg cube, which contains only aggregates above certain thresholds, is an effective way to derive nontrivial multi-dimensional aggregations for OLAP and data mining.
In this paper, we study efficient methods for computing iceberg cubes with some popularly used complex measures, such as average, and develop a methodology that adopts a weaker but anti-monotonic condition for testing and pruning search space. In particular, for efficient computation of iceberg cubes with the average measure, we propose a top-k average pruning method and extend two previously studied methods, Apriori and BUC, to Top-k Apriori and Top-k BUC. To further improve the performance, an interesting hypertree structure, called H-tree, is designed and a new iceberg cubing method, called Top-k H-Cubing, is developed. Our performance study shows that Top-k BUC and Top-k H-Cubing are two promising candidates for scalable computation, and Top-k H-Cubing has better performance in most cases.
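The abstract does not spell out the pruning test, so the following is only a minimal sketch of one way a top-k average bound could work, under stated assumptions. It assumes the iceberg condition is "average >= threshold and count >= k", and that the top-k average of a cell is the mean of its k largest measure values. The average itself is not anti-monotonic (a sub-cell can have a higher average than its parent), but under these assumptions the top-k average is an upper bound on the average of every sub-cell with at least k tuples, so a cell whose top-k average falls below the threshold can be discarded together with its descendants. The function names `top_k_average` and `can_prune` are hypothetical and do not come from the paper.

```python
def top_k_average(values, k):
    """Mean of the k largest values, or None when fewer than k values exist."""
    if len(values) < k:
        return None
    top = sorted(values, reverse=True)[:k]
    return sum(top) / k


def can_prune(values, k, threshold):
    """Return True when no sub-cell with at least k tuples can reach the threshold.

    Assumption (not taken from the abstract): the iceberg condition is
    avg >= threshold together with count >= k.
    """
    bound = top_k_average(values, k)
    # Fewer than k tuples: the count condition already fails for every sub-cell.
    if bound is None:
        return True
    # The top-k average bounds the average of any subset of size >= k.
    return bound < threshold


# A cell whose plain average (5.5) is below the threshold 8 is kept,
# because a sub-cell holding the tuples 10 and 9 could still qualify.
cell_a = [10, 2, 9, 1]
keep_a = not can_prune(cell_a, k=2, threshold=8)

# A cell whose two largest values average 4.5 can never reach 6.
cell_b = [5, 4, 3]
prune_b = can_prune(cell_b, k=2, threshold=6)
```

The design point is that the weaker condition may keep some cells that later fail the real test, but it never discards a cell whose descendants could still satisfy the iceberg condition.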
Publisher
Association for Computing Machinery (ACM)
Subject
Information Systems, Software
Cited by
45 articles.