Abstract
Top-k diversity queries over objects embedded in a low-dimensional vector space aim to retrieve the best
k
objects that are both relevant to given user's criteria and well distributed over a designated region. An interesting case is provided by spatial Web objects, which are produced in great quantity by location-based services that let users attach content to places and are found also in domains like trip planning, news analysis, and real estate. In this article we present a technique for addressing such queries that, unlike existing methods for diversified top-
k
queries, does not require accessing and scanning
all
relevant objects in order to find the best
k
results. Our
Space Partitioning and Probing
(SPP) algorithm works by progressively exploring the vector space, while keeping track of the already seen objects and of their relevance and position. The goal is to provide a good quality result set in terms of both relevance and diversity. We assess quality by using as a baseline the result set computed by MMR, one of the most popular diversification algorithms, while minimizing the number of accessed objects. In order to do so, SPP exploits score-based and distance-based access methods, which are available, for instance, in most geo-referenced Web data sources. Experiments with both synthetic and real data show that SPP produces results that are relevant and spatially well distributed, while significantly reducing the number of accessed objects and incurring a very low computational overhead.
Publisher
Association for Computing Machinery (ACM)
Cited by
13 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献