Affiliation:
1. Università di Bologna, Bologna, Italy
2. Politecnico di Milano, Milan, Italy
Abstract
Skyline and ranking queries are two popular, alternative ways of discovering interesting data in large datasets. Skyline queries are simple to specify, as they just return the set of all non-dominated tuples, thereby providing an overall view of potentially interesting results. However, they are not equipped with any means to accommodate user preferences or to control the cardinality of the result set. Ranking queries adopt, instead, a specific scoring function to rank tuples, and can easily control the output size. While specifying a scoring function allows one to give different importance to different attributes by means of, e.g., weight parameters, choosing the “right” weights to use is known to be a hard problem.
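The contrast between the two query types can be made concrete with a minimal sketch. It assumes lower attribute values are better and uses a weighted sum as the scoring function; the names `dominates`, `skyline`, `top_k`, and the `hotels` data are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(t: Point, s: Point) -> bool:
    """t dominates s: no worse on every attribute, strictly better on one.

    Assumption: lower values are better on every attribute.
    """
    return all(a <= b for a, b in zip(t, s)) and any(a < b for a, b in zip(t, s))


def skyline(points: Sequence[Point]) -> List[Point]:
    """Return all non-dominated tuples (quadratic scan, for illustration only)."""
    return [t for t in points if not any(dominates(s, t) for s in points)]


def top_k(points: Sequence[Point], weights: Sequence[float], k: int) -> List[Point]:
    """Rank tuples by a weighted-sum score (lower is better) and keep the best k."""
    def score(t: Point) -> float:
        return sum(w * a for w, a in zip(weights, t))
    return sorted(points, key=score)[:k]


# Hypothetical data: (price, distance) pairs.
hotels = [(1.0, 9.0), (3.0, 4.0), (8.0, 1.0), (5.0, 5.0), (9.0, 9.0)]

sky = skyline(hotels)                  # needs no parameters, size not controllable
best2 = top_k(hotels, (0.5, 0.5), 2)   # needs weights, size fixed by k
```

The skyline call takes no parameters but returns however many tuples happen to be non-dominated, whereas the ranking call returns exactly k tuples but depends on the chosen weights.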
In this article, we embrace the skyline approach by introducing an original framework able to capture user preferences by means of constraints on the weights used in a scoring function, which is typically much easier than specifying precise weight values. To this end, we introduce the novel concept of F-dominance, i.e., dominance with respect to a family of scoring functions F: a tuple t is said to F-dominate tuple s when t is always better than or equal to s according to all the functions in F. Based on F-dominance, we present two flexible skyline (F-skyline) operators, both returning a subset of the skyline: nd, characterizing the set of non-F-dominated tuples; po, referring to the tuples that are also potentially optimal, i.e., best according to some function in F. While nd and po coincide and reduce to the traditional skyline when F is the family of all monotone scoring functions, their behaviors differ when subsets thereof are considered. We discuss the formal properties of these new operators, show how to implement them efficiently, and evaluate them on both synthetic and real datasets.
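As an illustration of how nd and po can differ, the following sketch works through a small two-attribute example. It is not the article's implementation. It assumes lower values are better, takes F to be the hypothetical family of weighted sums w1*x + w2*y with w1 + w2 = 1 and w1 >= w2, and adds a strictness condition so that a tuple does not F-dominate itself. Because the score difference between two tuples is linear in the weights, checking it at the two extreme weight vectors of this family is sufficient.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]

# Hypothetical family F: score(t) = lam * x + (1 - lam) * y,
# with lam in [LAM_LO, LAM_HI] encoding the constraint w1 >= w2.
LAM_LO, LAM_HI = 0.5, 1.0


def score(t: Point, lam: float) -> float:
    return lam * t[0] + (1.0 - lam) * t[1]


def f_dominates(t: Point, s: Point) -> bool:
    """t is never worse than s over F, and strictly better somewhere.

    The difference is linear in lam, so the two endpoints decide it.
    The strictness requirement is an assumption added for this sketch.
    """
    diffs = [score(t, lam) - score(s, lam) for lam in (LAM_LO, LAM_HI)]
    return all(d <= 0 for d in diffs) and any(d < 0 for d in diffs)


def nd(points: Sequence[Point]) -> List[Point]:
    """Tuples that no other tuple F-dominates."""
    return [t for t in points if not any(f_dominates(s, t) for s in points)]


def po(points: Sequence[Point]) -> List[Point]:
    """Tuples in nd that are best for at least one lam in the interval."""
    result = []
    for t in nd(points):
        lo, hi = LAM_LO, LAM_HI
        for s in points:
            # score(t) - score(s) = a * lam + b must be <= 0.
            d1, d2 = t[0] - s[0], t[1] - s[1]
            a, b = d1 - d2, d2
            if a > 0:
                hi = min(hi, -b / a)
            elif a < 0:
                lo = max(lo, -b / a)
            elif b > 0:
                lo, hi = 1.0, 0.0  # t is worse than s for every lam
        if lo <= hi:
            result.append(t)
    return result


# Hypothetical data; the last tuple is in the skyline and in nd, but not in po.
points = [(1.0, 9.0), (3.0, 4.0), (8.0, 1.0), (5.0, 5.0), (9.0, 9.0), (2.5, 6.5)]
nd_result = nd(points)
po_result = po(points)
```

On this data, (8.0, 1.0) belongs to the ordinary skyline but is F-dominated by (3.0, 4.0) once the weights are constrained, and (2.5, 6.5) is not F-dominated by any single tuple yet is never the best one for any admissible weight, so it appears in nd but not in po.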
Publisher
Association for Computing Machinery (ACM)
Cited by
3 articles.