Affiliation:
1. University of Ioannina, Greece
2. Aarhus University, Denmark
Abstract
Similarity search in high-dimensional metric spaces is routinely used in many applications including content-based image retrieval, bioinformatics, data mining, and recommender systems. Search can be accelerated by the use of an index. However, constructing a high-dimensional index can be quite expensive and may not pay off if the number of queries against the data is not large. In these circumstances, it is beneficial to construct an index adaptively, while responding to a query workload. Existing work on multidimensional adaptive indexing partitions space into orthotopes (i.e., hyperrectangular units). This approach, however, is highly ineffective in high-dimensional spaces. In this paper, we propose AV-tree: an alternative method for adaptive high-dimensional indexing that exploits previously computed distances, using query centers as vantage points. Our experimental study shows that AV-tree yields a cumulative cost for the first several hundred or even thousand queries that is much lower than that of pre-built indices. After thousands of queries, the per-query performance of the AV-tree converges to or even surpasses that of the state-of-the-art MVP-tree. Arguably, our approach is commendable in environments where the expected number of queries is not large while there is a need to start answering queries as soon as possible, such as applications where data are updated frequently and past data soon become obsolete.
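The general idea of reusing query-time distances to build a vantage-point partition can be illustrated with a minimal Python sketch. This is not the AV-tree algorithm from the paper, whose details the abstract does not give. It assumes range queries, a Euclidean metric, and a simple median split. The names `AdaptiveVantageIndex`, `Node`, `leaf_size`, and `_crack` are hypothetical and chosen for illustration only.

```python
import math
from statistics import median


class Node:
    """A leaf holds unindexed points; an internal node holds a vantage split."""

    def __init__(self, points):
        self.points = points  # list of points while a leaf, None once split
        self.vantage = None   # query center reused as vantage point
        self.mu = None        # split radius around the vantage point
        self.inner = None     # points with dist(vantage, p) <= mu
        self.outer = None     # points with dist(vantage, p) > mu


class AdaptiveVantageIndex:
    """Illustrative sketch: the index is refined only where queries touch it."""

    def __init__(self, points, leaf_size=8, dist=math.dist):
        self.root = Node(list(points))  # starts as one big unindexed leaf
        self.leaf_size = leaf_size
        self.dist = dist

    def range_query(self, q, r):
        """Return all points within distance r of q, refining the index."""
        out = []
        self._search(self.root, q, r, out)
        return out

    def _search(self, node, q, r, out):
        if node.points is not None:
            # Leaf: these distances are needed to answer the query anyway.
            dists = [self.dist(q, p) for p in node.points]
            out.extend(p for p, d in zip(node.points, dists) if d <= r)
            if len(node.points) > self.leaf_size:
                self._crack(node, q, dists)  # reuse the distances for a split
            return
        d = self.dist(q, node.vantage)
        # Triangle inequality: inner points are at least d - mu away from q,
        # outer points are more than mu - d away from q.
        if d - r <= node.mu:
            self._search(node.inner, q, r, out)
        if d + r >= node.mu:
            self._search(node.outer, q, r, out)

    def _crack(self, node, q, dists):
        """Split a leaf around query center q using already computed distances."""
        mu = median(dists)
        inner = [p for p, d in zip(node.points, dists) if d <= mu]
        outer = [p for p, d in zip(node.points, dists) if d > mu]
        if not outer:
            return  # degenerate split (all distances equal): stay a leaf
        node.vantage, node.mu = q, mu
        node.inner, node.outer = Node(inner), Node(outer)
        node.points = None
```

In this sketch the first query scans all points, as it would without any index, and pays only the extra cost of a median split. Later queries prune subtrees whose distance bounds exclude the query ball, so the indexing cost is spread over the workload instead of being paid up front.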
Publisher
Association for Computing Machinery (ACM)
Subject
General Earth and Planetary Sciences; Water Science and Technology; Geography, Planning and Development
Cited by
4 articles.