Abstract
Nearest neighbor search (NNS) has a wide range of applications in information retrieval, computer vision, machine learning, databases, and other areas. The existing state-of-the-art algorithm for nearest neighbor search, Hierarchical Navigable Small World Networks (HNSW), is unable to scale to large datasets of 100M records in high dimensions. In this paper, we propose LANNS (Library for Large Scale Approximate Nearest Neighbor Search), an end-to-end platform for approximate nearest neighbor search that scales to web-scale datasets. LANNS is deployed in multiple production systems for identifying the top-K (100 ≤ K ≤ 200) approximate nearest neighbors with a latency of a few milliseconds per query and a high throughput of ~2.5k queries per second (QPS) on a single node, on large (e.g., ~180M data points), high-dimensional (50 to 2048 dimensions) datasets.
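For readers unfamiliar with the top-K task described in the abstract, the sketch below shows the exact (brute-force) baseline that approximate methods such as HNSW or LANNS trade accuracy against. It is not the LANNS algorithm or its API. The function names `exact_top_k` and `recall_at_k`, the use of Euclidean distance, and the toy data are assumptions made purely for illustration.

```python
import numpy as np


def exact_top_k(data, query, k):
    """Return indices of the k rows of `data` nearest to `query`, nearest first.

    Illustrative brute-force baseline (Euclidean distance assumed);
    cost is linear in the number of data points per query.
    """
    dists = np.linalg.norm(data - query, axis=1)
    k = min(k, len(dists))
    # argpartition selects the k smallest distances without a full sort.
    candidates = np.argpartition(dists, k - 1)[:k]
    # Sort only the k selected candidates by distance.
    return candidates[np.argsort(dists[candidates])]


def recall_at_k(approx_ids, exact_ids):
    """Fraction of the exact top-k that an approximate result recovered."""
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)


# Toy data: four 2-dimensional points (hypothetical values).
data = np.array([[0.0, 0.0], [1.0, 0.0], [5.0, 5.0], [0.1, 0.0]])
query = np.array([0.0, 0.0])
exact = exact_top_k(data, query, k=2)
```

Because the brute-force scan touches every point for every query, it becomes impractical at the dataset sizes quoted above, which is the motivation for approximate indexes. Recall against this exact result is a common way to evaluate them.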
Publisher
Association for Computing Machinery (ACM)
Subject
General Earth and Planetary Sciences, Water Science and Technology, Geography, Planning and Development
Cited by
4 articles.