Affiliation:
1. Department of Statistics, Texas A&M University, College Station, Texas 77843, U.S.A.
Abstract
Summary
Motivated by the increasing use of kernel-based metrics for high-dimensional and large-scale data, we study the asymptotic behaviour of kernel two-sample tests when the dimension and sample sizes both diverge to infinity. We focus on the maximum mean discrepancy using an isotropic kernel, which includes as special cases the maximum mean discrepancy with the Gaussian kernel, the maximum mean discrepancy with the Laplace kernel, and the energy distance. We derive asymptotic expansions of the kernel two-sample statistics, based on which we establish a central limit theorem under the null hypothesis as well as under local and fixed alternatives. The new nonnull central limit theorem results allow us to perform asymptotically exact power analysis, which reveals a delicate interplay between the moment discrepancy that can be detected by the kernel two-sample tests and the dimension-and-sample orders. The asymptotic theory is further corroborated through numerical studies.
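To make the statistic concrete, the following is a minimal sketch of the standard unbiased U-statistic estimator of the squared maximum mean discrepancy, evaluated with a Gaussian or a Laplace kernel. It is an illustration under stated assumptions rather than the authors' implementation: the function names, the `bandwidth` parametrization and the synthetic data are all hypothetical and do not come from the paper.

```python
import numpy as np


def sq_dists(a, b):
    """Pairwise squared Euclidean distances between rows of a and rows of b."""
    aa = np.sum(a * a, axis=1)[:, None]
    bb = np.sum(b * b, axis=1)[None, :]
    # Clip at zero to guard against tiny negative values from round-off.
    return np.maximum(aa + bb - 2.0 * (a @ b.T), 0.0)


def gaussian_kernel(d2, bandwidth):
    """Gaussian kernel as a function of squared distance (assumed scaling)."""
    return np.exp(-d2 / (2.0 * bandwidth ** 2))


def laplace_kernel(d2, bandwidth):
    """Laplace kernel as a function of squared distance (assumed scaling)."""
    return np.exp(-np.sqrt(d2) / bandwidth)


def mmd2_unbiased(x, y, kernel, bandwidth):
    """Unbiased estimate of squared MMD between samples x (n x p) and y (m x p)."""
    n, m = x.shape[0], y.shape[0]
    kxx = kernel(sq_dists(x, x), bandwidth)
    kyy = kernel(sq_dists(y, y), bandwidth)
    kxy = kernel(sq_dists(x, y), bandwidth)
    # Within-sample averages exclude the diagonal (i == j) terms.
    within_x = (kxx.sum() - np.trace(kxx)) / (n * (n - 1))
    within_y = (kyy.sum() - np.trace(kyy)) / (m * (m - 1))
    return within_x + within_y - 2.0 * kxy.mean()


# Illustrative data: two Gaussian samples in dimension p that differ in mean.
rng = np.random.default_rng(0)
p = 5
x = rng.normal(0.0, 1.0, size=(200, p))
y = rng.normal(1.0, 1.0, size=(200, p))
stat_gauss = mmd2_unbiased(x, y, gaussian_kernel, bandwidth=np.sqrt(p))
stat_laplace = mmd2_unbiased(x, y, laplace_kernel, bandwidth=np.sqrt(p))
```

Because the estimator is unbiased, it can take small negative values when the two samples come from the same distribution. Scaling the bandwidth with the dimension, as done here, is only one common heuristic and is an assumption of this sketch.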
Publisher
Oxford University Press (OUP)
Subject
Applied Mathematics; Statistics, Probability and Uncertainty; General Agricultural and Biological Sciences; Agricultural and Biological Sciences (miscellaneous); General Mathematics; Statistics and Probability