Affiliation:
1. UC Berkeley, Soda Hall, Berkeley, CA
Abstract
We show that approximate near neighbor search in high dimensions can be solved in a Las Vegas fashion (i.e., without false negatives) for $\ell_p$ ($1 \le p \le 2$) while matching the performance of optimal locality-sensitive hashing. Specifically, we construct a data-independent Las Vegas data structure with query time $O(dn^{\rho})$ and space usage $O(dn^{1+\rho})$ for $(r, cr)$-approximate near neighbors in $\mathbb{R}^d$ under the $\ell_p$ norm, where $\rho = 1/c^p + o(1)$. Furthermore, we give a Las Vegas locality-sensitive filter construction for the unit sphere that can be used with the data-dependent data structure of Andoni et al. (SODA 2017) to achieve optimal space-time tradeoffs in the data-dependent setting. For the symmetric case, this gives us a data-dependent Las Vegas data structure with query time $O(dn^{\rho})$ and space usage $O(dn^{1+\rho})$ for $(r, cr)$-approximate near neighbors in $\mathbb{R}^d$ under the $\ell_p$ norm, where $\rho = 1/(2c^p - 1) + o(1)$.
Our data-independent construction improves on the recent Las Vegas data structure of Ahle (FOCS 2017) for $\ell_p$ when $1 < p \le 2$. Our data-dependent construction performs even better for $\ell_p$ for all $p \in [1, 2]$ and is the first Las Vegas approximate near neighbors data structure to make use of data-dependent approaches. We also answer open questions of Indyk (SODA 2000), Pagh (SODA 2016), and Ahle by showing that for approximate near neighbors, Las Vegas data structures can match state-of-the-art Monte Carlo data structures in performance for both the data-independent and data-dependent settings and across space-time tradeoffs.
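To make the comparison between the two exponents concrete, the following is a minimal illustrative sketch that evaluates only the leading terms of $\rho$ stated above, dropping the $o(1)$ term. The function names and the sample values of $c$ and $p$ are assumptions chosen for illustration; they do not come from the paper, and the code does not implement either data structure.

```python
def _check_params(c: float, p: float) -> None:
    """Validate the regime covered by the abstract: c > 1 and 1 <= p <= 2."""
    if c <= 1.0:
        raise ValueError("approximation factor c must be greater than 1")
    if not 1.0 <= p <= 2.0:
        raise ValueError("p must lie in [1, 2]")


def rho_data_independent(c: float, p: float) -> float:
    """Leading term 1 / c^p of the data-independent exponent (o(1) dropped)."""
    _check_params(c, p)
    return 1.0 / c**p


def rho_data_dependent(c: float, p: float) -> float:
    """Leading term 1 / (2 c^p - 1) of the data-dependent exponent (o(1) dropped)."""
    _check_params(c, p)
    return 1.0 / (2.0 * c**p - 1.0)


if __name__ == "__main__":
    # Hypothetical sample parameters, chosen only to illustrate the gap.
    for c, p in [(1.5, 1.0), (2.0, 1.0), (2.0, 2.0)]:
        indep = rho_data_independent(c, p)
        dep = rho_data_dependent(c, p)
        print(f"c={c}, p={p}: data-independent {indep:.4f}, data-dependent {dep:.4f}")
```

Since $c^p > 1$ whenever $c > 1$, we have $2c^p - 1 > c^p$, so the data-dependent leading term is strictly smaller than the data-independent one throughout this range, which is consistent with the claim that the data-dependent construction performs better for all $p \in [1, 2]$.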
Funder
Harvard PRISE Fellowship and a Herchel Smith Fellowship
Publisher
Association for Computing Machinery (ACM)
Subject
Mathematics (miscellaneous)