Affiliation:
1. Department of Statistics, Chiang Mai University, 50200 Chiang Mai, Thailand
Abstract
We study the $k$-nearest neighbour classifier ($k$-NN) of probability measures under the Wasserstein distance. We show that the $k$-NN classifier is not universally consistent on the space of measures supported in $(0,1)$. As any Euclidean ball contains a copy of $(0,1)$, one should not expect to obtain universal consistency without some restriction on the base metric space, or the Wasserstein space itself. To this end, via the notion of $\sigma$-finite metric dimension, we show that the $k$-NN classifier is universally consistent on spaces of discrete measures (and more generally, $\sigma$-finite uniformly discrete measures) with rational mass. In addition, by studying the geodesic structures of the Wasserstein spaces for $p=1$ and $p=2$, we show that the $k$-NN classifier is universally consistent on spaces of measures supported on a finite set, the space of Gaussian measures and spaces of measures with finite wavelet series densities.
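As an illustration of the object under study (not the authors' code), the following minimal sketch classifies a probability measure by majority vote among its $k$ nearest training measures in Wasserstein distance. It assumes a simplified setting: each measure is a uniform empirical measure on the real line with the same number of atoms, in which case $W_p$ reduces to comparing sorted atoms. The names `wasserstein_1d` and `knn_classify` and the toy data are hypothetical.

```python
from collections import Counter


def wasserstein_1d(xs, ys, p=1):
    """W_p between two uniform empirical measures on the real line.

    Assumes both measures have the same number of atoms, each with
    equal mass; the optimal coupling then matches sorted atoms.
    """
    if len(xs) != len(ys):
        raise ValueError("this sketch requires equally many atoms")
    xs, ys = sorted(xs), sorted(ys)
    cost = sum(abs(x - y) ** p for x, y in zip(xs, ys)) / len(xs)
    return cost ** (1.0 / p)


def knn_classify(train, query, k=3, p=1):
    """Majority vote among the k training measures closest to `query`.

    `train` is a list of (atoms, label) pairs.
    """
    neighbours = sorted(
        train, key=lambda item: wasserstein_1d(item[0], query, p)
    )[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Toy labelled sample of measures supported in (0, 1).
train = [
    ([0.1, 0.2, 0.3], "low"),
    ([0.15, 0.25, 0.35], "low"),
    ([0.7, 0.8, 0.9], "high"),
    ([0.6, 0.75, 0.85], "high"),
]
label = knn_classify(train, [0.2, 0.3, 0.4], k=3)
```

Universal consistency concerns the behaviour of this rule as the training sample grows (with $k$ growing suitably), for every joint distribution of measures and labels; the sketch only shows the classification rule itself.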
Publisher
Oxford University Press (OUP)
Subject
Applied Mathematics, Computational Theory and Mathematics, Numerical Analysis, Statistics and Probability, Analysis