Abstract
k-nearest-neighbour (kNN) queries are fundamental to many applications, ranging from data mining, recommendation systems and the Internet of Things to Industry 4.0 applications. In data mining specifically, they are used for human activity classification, iterative closest point registration and pattern recognition, and they have also proved helpful in intrusion detection systems and fault detection. Owing to their importance, many kNN algorithms have been proposed in the literature for both static and dynamic data. In this paper, we present a comprehensive survey of exact kNN queries. In particular, we study two fundamental types: kNN Search queries and kNN Join queries. Our survey focuses on exact approaches over high-dimensional data spaces and covers 20 kNN Search methods and 9 kNN Join methods. To the best of our knowledge, this is the first comprehensive survey of exact kNN queries over high-dimensional datasets. We categorise the algorithms by indexing strategy, data and space partitioning strategy, clustering technique and computing paradigm. We provide insights into how the approaches have evolved with respect to each categorisation factor, as well as the possibility of further expansion. Lastly, we discuss open challenges and future research directions.
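To make the two query types concrete, the following is a minimal brute-force sketch, not any of the surveyed methods. It assumes Euclidean distance and small in-memory point sets; the function names `knn_search` and `knn_join` and the sample data are hypothetical and chosen only for illustration. A kNN Search returns the k closest points to a single query, while a kNN Join returns the k closest points in a set S for every point in a set R.

```python
import heapq
import math


def knn_search(query, points, k):
    """Exact kNN Search by linear scan: the k points closest to `query`."""
    # math.dist computes the Euclidean distance between two coordinate tuples.
    return heapq.nsmallest(k, points, key=lambda p: math.dist(query, p))


def knn_join(R, S, k):
    """Exact kNN Join: for each point r in R, its k nearest neighbours in S."""
    return {r: knn_search(r, S, k) for r in R}


# Illustrative data only (assumed, not from the paper).
S = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (5.0, 5.0)]
R = [(0.1, 0.0), (4.0, 4.0)]

nearest = knn_search((0.9, 1.2), S, 2)
joined = knn_join(R, S, 1)
```

The linear scan costs one distance computation per point for each query, which is the baseline that indexing, partitioning and clustering strategies aim to improve on.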
Subject
Electrical and Electronic Engineering; Biochemistry; Instrumentation; Atomic and Molecular Physics, and Optics; Analytical Chemistry
Cited by
32 articles.