Affiliation:
1. AT&T Labs Research, Florham Park, NJ
2. University of Munich, Germany
Abstract
In this paper, we propose the Pyramid-Technique, a new indexing method for high-dimensional data spaces. The Pyramid-Technique is highly adapted to range query processing using the maximum metric L_max. In contrast to all other index structures, the performance of the Pyramid-Technique does not deteriorate when processing range queries on data of higher dimensionality. The Pyramid-Technique is based on a special partitioning strategy which is optimized for high-dimensional data. The basic idea is to first divide the data space into 2d pyramids sharing the center point of the space as their top. In a second step, the single pyramids are cut into slices parallel to the basis of the pyramid. These slices form the data pages. Furthermore, we show that this partition provides a mapping from the given d-dimensional space to a 1-dimensional space. Therefore, we are able to use a B+-tree to manage the transformed data. As an analytical evaluation of our technique for hypercube range queries and uniform data distribution shows, the Pyramid-Technique clearly outperforms index structures using other partitioning strategies. To demonstrate the practical relevance of our technique, we experimentally compared the Pyramid-Technique with the X-tree, the Hilbert R-tree, and the Linear Scan. The results of our experiments using both synthetic and real data demonstrate that the Pyramid-Technique outperforms the X-tree and the Hilbert R-tree by a factor of up to 14 (number of page accesses) and up to 2500 (total elapsed time) for range queries.
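The mapping from the d-dimensional space to a 1-dimensional key can be illustrated with a minimal Python sketch. It assumes the data space is the unit hypercube [0, 1]^d and that a point's key is the number of the pyramid containing it plus the point's distance from the center along that pyramid's axis. The abstract does not state this formula, and the function name pyramid_value is hypothetical, so treat this as an illustration under those assumptions rather than the authors' exact definition.

```python
from typing import Sequence


def pyramid_value(point: Sequence[float]) -> float:
    """Map a point in [0, 1]^d to a 1-dimensional key (assumed formula)."""
    d = len(point)
    # The dimension in which the point deviates most from the center 0.5
    # determines which of the 2d pyramids contains it.
    j_max = max(range(d), key=lambda j: abs(0.5 - point[j]))
    # Height: distance from the center point along that dimension (0 to 0.5).
    height = abs(0.5 - point[j_max])
    # Pyramids 0..d-1 lie on the "low" side of the center, d..2d-1 on the "high" side.
    pyramid = j_max if point[j_max] < 0.5 else j_max + d
    return pyramid + height


# Sorting by key stands in for inserting the points into a B+-tree.
points = [(0.1, 0.5), (0.5, 0.9), (0.45, 0.3)]
ordered = sorted(points, key=pyramid_value)
```

Under these assumptions the keys of pyramid i fall in the interval [i, i + 0.5], so the 2d pyramids occupy disjoint key ranges, and slices parallel to a pyramid's basis correspond to contiguous key ranges that an ordered structure such as a B+-tree can store on data pages.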
Publisher
Association for Computing Machinery (ACM)
Subject
Information Systems, Software
Cited by
104 articles.