Fast parallel similarity search in multimedia databases

Published:1997-06 Issue:2 Volume:26 Page:1-12
ISSN:0163-5808
Container-title:ACM SIGMOD Record
language:en
Short-container-title:SIGMOD Rec.

Author:

Berchtold Stefan¹, Böhm Christian¹, Braunmüller Bernhard¹, Keim Daniel A.¹, Kriegel Hans-Peter¹

Affiliation:

1. University of Munich, Germany

Abstract

Most similarity search techniques map the data objects into some high-dimensional feature space. The similarity search then corresponds to a nearest-neighbor search in the feature space, which is computationally very intensive. In this paper, we present a new parallel method for fast nearest-neighbor search in high-dimensional feature spaces. The core problem of designing a parallel nearest-neighbor algorithm is to find an adequate distribution of the data onto the disks. Unfortunately, the known declustering methods do not perform well for high-dimensional nearest-neighbor search. In contrast, our method has been optimized based on the special properties of high-dimensional spaces and therefore provides a near-optimal distribution of the data items among the disks. The basic idea of our data declustering technique is to assign the buckets corresponding to different quadrants of the data space to different disks. We show that our technique - in contrast to other declustering methods - guarantees that all buckets corresponding to neighboring quadrants are assigned to different disks. We evaluate our method using large amounts of real data (up to 40 MBytes) and compare it with the best known data declustering method, the Hilbert curve. Our experiments show that our method provides an almost linear speed-up and a constant scale-up. Additionally, it outperforms the Hilbert approach by a factor of up to 5.
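The quadrant-to-disk idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the assignment function, so everything below is an assumption made for illustration: each dimension is split once at 0.5, a quadrant is identified by one bit per dimension, and the disk number is the XOR of (i + 1) over all set bits i. The names `quadrant_id`, `disk_for_quadrant`, and `num_disks` are hypothetical. This particular XOR scheme does place quadrants that differ in one or two dimensions on different disks, but it should not be read as the paper's exact method.

```python
def quadrant_id(point, split=0.5):
    """Encode a point's quadrant as an integer: bit i is set when
    coordinate i lies in the upper half of dimension i (assumed split)."""
    return sum(1 << i for i, x in enumerate(point) if x >= split)


def disk_for_quadrant(quadrant, dims):
    """Illustrative assignment: XOR together (i + 1) for every set bit i.

    Two quadrants that differ only in bit i get disk numbers that differ
    by XOR with (i + 1), which is never zero, so they land on different
    disks. Quadrants differing in bits i and j differ by
    (i + 1) XOR (j + 1), which is also nonzero when i != j.
    """
    disk = 0
    for i in range(dims):
        if (quadrant >> i) & 1:
            disk ^= i + 1
    return disk


def num_disks(dims):
    """Upper bound on the disk numbers produced above: the smallest
    power of two strictly greater than dims."""
    return 1 << dims.bit_length()
```

For example, under these assumptions a 3-dimensional point such as `[0.1, 0.9, 0.6]` falls into quadrant `0b110`, and `disk_for_quadrant(0b110, 3)` maps that quadrant to one of `num_disks(3)` disks.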

Publisher

Association for Computing Machinery (ACM)

Subject

Information Systems,Software

Link

https://dl.acm.org/doi/pdf/10.1145/253262.253263

References: 23 articles.

1. Basic local alignment search tool

2. A cost model for nearest neighbor search in high-dimensional data space

Cited by 37 articles.

1. Observations on big data concerning storage and security management;AIP Conference Proceedings;2024

2. GRADES: Gradient Descent for Similarity Caching;IEEE/ACM Transactions on Networking;2023-02

3. Developing A Parallel Program For Image Similarity Search Using Hashing Methods;2022 International Conference on Data Analytics for Business and Industry (ICDABI);2022-10-25

4. Efficient parallel processing of high-dimensional spatial kNN queries;Soft Computing;2022-05-02

5. Data Allocation with Neural Similarity Estimation for Data-Intensive Computing;Computational Science – ICCS 2022;2022