Self-spatial join selectivity estimation using fractal concepts

Authors:

Belussi Alberto (1), Faloutsos Christos (2)

Affiliation:

1. Politecnico di Milano

2. University of Maryland

Abstract

The problem of selectivity estimation for queries of nontraditional databases is still an open issue. In this article, we examine the problem of selectivity estimation for some types of spatial queries in databases containing real data. We have shown earlier [Faloutsos and Kamel 1994] that real point sets typically have a nonuniform distribution, consistently violating the uniformity and independence assumptions. Moreover, we demonstrated that the theory of fractals can help to describe real point sets. In this article we show how the concept of fractal dimension, i.e., a (noninteger) dimension, can lead to the solution for the selectivity estimation problem in spatial databases. Among the infinite family of fractal dimensions, we consider here the Hausdorff fractal dimension D0 and the "Correlation" fractal dimension D2. Specifically, we show that (a) the average number of neighbors for a given point set follows a power law, with D2 as exponent, and (b) the average number of nonempty range queries follows a power law with E − D0 as exponent (E is the dimension of the embedding space). We present the formulas to estimate the selectivity for "biased" range queries, for self-spatial joins, and for the average number of nonempty range queries. The results of some experiments on real and synthetic point sets are shown. Our formulas achieve very low relative errors, typically about 10%, versus 40%–100% for the formulas that are based on the uniformity and independence assumptions.
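The power law in claim (a) can be illustrated with a small sketch. It is not the procedure from the article: it is a brute-force pair count under assumptions chosen here for illustration, namely the L-infinity distance, a synthetic point set lying on the diagonal of the unit square (embedding dimension E = 2, intrinsic dimension 1), and the hypothetical helper names `pair_count`, `fit_power_law`, and `estimate_pairs`. The slope of the log-log fit plays the role of D2, and the fitted line is then used to estimate the number of qualifying pairs of a self-spatial join at an unseen radius.

```python
import math
import random


def pair_count(points, r):
    """Count unordered pairs of distinct 2-d points within L-infinity distance r."""
    count = 0
    n = len(points)
    for i in range(n):
        xi, yi = points[i]
        for j in range(i + 1, n):
            xj, yj = points[j]
            if max(abs(xi - xj), abs(yi - yj)) <= r:
                count += 1
    return count


def fit_power_law(radii, counts):
    """Least-squares line through (log r, log count).

    Returns (slope, intercept). All counts must be positive.
    """
    xs = [math.log(r) for r in radii]
    ys = [math.log(c) for c in counts]
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def estimate_pairs(r, slope, intercept):
    """Power-law estimate of the pair count at radius r."""
    return math.exp(intercept) * r ** slope


# Assumed toy data: points on the diagonal of the unit square.
rng = random.Random(0)
points = [(t, t) for t in (rng.random() for _ in range(600))]

radii = [0.01, 0.02, 0.04, 0.08]
counts = [pair_count(points, r) for r in radii]
slope, intercept = fit_power_law(radii, counts)
print("fitted exponent (stand-in for D2):", round(slope, 3))

# Compare estimates at a radius that was not used for the fit.
r_query = 0.03
n = len(points)
actual = pair_count(points, r_query)
fractal_estimate = estimate_pairs(r_query, slope, intercept)
# Uniformity assumption in E = 2, ignoring border effects:
# a pair qualifies with probability of roughly (2r)^2.
uniform_estimate = n * (n - 1) / 2 * (2 * r_query) ** 2
print("actual pairs:", actual)
print("power-law estimate:", round(fractal_estimate))
print("uniformity estimate:", round(uniform_estimate))
```

On this toy set the fitted exponent should come out close to 1 rather than 2, which is why an estimate based on uniformity in the embedding space undercounts the qualifying pairs. The quadratic pair count is used only for clarity and would be too slow for large point sets.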

Publisher

Association for Computing Machinery (ACM)

Subject

Computer Science Applications; General Business, Management and Accounting; Information Systems

References: 34 articles.

1. Mining association rules between sets of items in large databases

2. QBISM: A prototype 3-D medical imaging database system; ARYA M.; IEEE Data Eng. Tech. Bull., 1993

3. A better way to compress images; BARNSLEY M. F.; BYTE, 1988

Cited by 21 articles.

1. Identification of 4FGL Uncertain Sources at Higher Resolutions with Inverse Discrete Wavelet Transform;The Astrophysical Journal;2024-01-01

2. A Learned Query Optimizer for Spatial Join;Proceedings of the 29th International Conference on Advances in Geographic Information Systems;2021-11-02

3. SWARM: Adaptive Load Balancing in Distributed Streaming Systems for Big Spatial Data;ACM Transactions on Spatial Algorithms and Systems;2021-06-07

4. Using Deep Learning for Big Spatial Data Partitioning;ACM Transactions on Spatial Algorithms and Systems;2021-03-31

5. Effective and efficient top-k query processing over incomplete data streams;Information Sciences;2021-01
