On nonmetric similarity search problems in complex domains

Published:2011-10 Issue:4 Volume:43 Page:1-50
ISSN:0360-0300
Container-title:ACM Computing Surveys
Language:en
Short-container-title:ACM Comput. Surv.

Author:

Skopal Tomáš¹, Bustos Benjamin²

Affiliation:

1. Charles University in Prague, Czech Republic

2. University of Chile, Chile

Abstract

The task of similarity search is widely used in various areas of computing, including multimedia databases, data mining, bioinformatics, social networks, etc. In fact, retrieval of semantically unstructured data entities requires a form of aggregated qualification that selects entities relevant to a query. A popular type of such a mechanism is similarity querying. For a long time, the database-oriented applications of similarity search employed the definition of similarity restricted to metric distances. Due to its topological properties, metric similarity can be effectively used to index a database which can then be queried efficiently by so-called metric access methods. However, together with the increasing complexity of data entities across various domains, in recent years there appeared many similarities that were not metrics—we call them nonmetric similarity functions. In this article we survey domains employing nonmetric functions for effective similarity search, and methods for efficient nonmetric similarity search. First, we show that the ongoing research in many of these domains requires complex representations of data entities. Simultaneously, such complex representations allow us to model also complex and computationally expensive similarity functions (often represented by various matching algorithms). However, the more complex similarity function one develops, the more likely it will be a nonmetric. Second, we review state-of-the-art techniques for efficient (fast) nonmetric similarity search, concerning both exact and approximate search. Finally, we discuss some open problems and possible future research trends.

Funder

Czech Science Foundation

Fondo Nacional de Desarrollo Científico y Tecnológico

Publisher

Association for Computing Machinery (ACM)

Subject

General Computer Science, Theoretical Computer Science

Link

https://dl.acm.org/doi/pdf/10.1145/1978802.1978813

Reference 158 articles.

1. The IGrid index

2. Basic local alignment search tool

Cited by 51 articles.

1. Filtering with relational similarity;Information Systems;2024-05

2. Worst-Case-Optimal Similarity Joins on Graph Databases;Proceedings of the ACM on Management of Data;2024-03-12

3. Visualizations for universal deep-feature representations: survey and taxonomy;Knowledge and Information Systems;2023-09-16

4. Unconventional application of k-means for distributed approximate similarity search;Information Sciences;2023-01

5. Mathematical Methods for the Shape Analysis and Indexing of Tangible CH Artefacts;Mathematical Modeling in Cultural Heritage;2023