Abstract
Two-sample testing is a fundamental problem in statistics. While many powerful nonparametric methods exist for both the univariate and multivariate contexts, frameworks for determining which data features lead to rejection of the null are comparatively rare. In this paper, we propose a new nonparametric two-sample test named AUGUST, which incorporates a framework for interpretation while maintaining power comparable to existing methods. AUGUST tests for inequality in distribution up to a predetermined resolution using symmetry statistics from binary expansion. Designed for univariate and low- to moderate-dimensional multivariate data, this construction allows us to understand distributional differences as a combination of fundamental orthogonal signals. Asymptotic theory for the test statistic facilitates p-value computation and power analysis, and an efficient algorithm enables computation on large data sets. In empirical studies, we show that our test has power comparable to that of popular existing methods, as well as greater power in some circumstances. We illustrate the interpretability of our method using NBA shooting data.
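To make the phrase "symmetry statistics from binary expansion" more concrete, the following is a minimal sketch of the general idea, not the AUGUST statistic as defined in the paper. It assumes the data have already been mapped to values in [0, 1), truncates each value to its first `depth` binary digits (the "resolution"), codes each digit as ±1, and sums the products of the coded digits over every non-empty subset of digit positions. The function names `binary_bits` and `symmetry_statistics`, the ±1 sign convention, and the one-sample setting are illustrative assumptions.

```python
import itertools

import numpy as np


def binary_bits(u, depth):
    """Return an (n, depth) array coding the first `depth` binary digits
    of each value in u (assumed to lie in [0, 1)) as -1 (digit 0) or +1 (digit 1)."""
    frac = np.array(u, dtype=float).ravel()
    bits = np.empty((frac.size, depth), dtype=int)
    for k in range(depth):
        frac = frac * 2.0
        digit = np.floor(frac).astype(int)  # k-th binary digit, 0 or 1
        frac = frac - digit
        bits[:, k] = 2 * digit - 1
    return bits


def symmetry_statistics(u, depth):
    """Hypothetical helper: for every non-empty subset of digit positions,
    sum over observations the product of the +/-1 coded digits in that subset.
    There are 2**depth - 1 such sums."""
    bits = binary_bits(u, depth)
    stats = {}
    for r in range(1, depth + 1):
        for subset in itertools.combinations(range(depth), r):
            stats[subset] = int(bits[:, list(subset)].prod(axis=1).sum())
    return stats


# Four points spread evenly over the four depth-2 cells: every sum is zero.
balanced = symmetry_statistics([0.125, 0.375, 0.625, 0.875], depth=2)

# Two points in the lower half of [0, 1): the first-digit sum is pulled negative.
skewed = symmetry_statistics([0.1, 0.2], depth=1)
```

In this sketch, sums far from zero flag the particular digit interactions in which the data depart from uniformity at the chosen resolution, which is the sense in which each sum can be read as a separate signal.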
Publisher
New England Statistical Society