Always Valid Inference: Continuous Monitoring of A/B Tests

Published: 2021-08-10
ISSN: 0030-364X
Container-title: Operations Research
Language: English

Author:

Johari Ramesh¹, Koomen Pete², Pekelis Leonid³, Walsh David⁴

Affiliation:

1. Department of Management Science and Engineering, Stanford University, Stanford, California 94305;

2. Optimizely, Inc., San Francisco, California 94105;

3. CloudTrucks, Inc., San Francisco, California 94103;

4. Unlearn.AI, San Francisco, California 94105

Abstract

A/B tests are typically analyzed via frequentist p-values and confidence intervals, but these inferences are wholly unreliable if users endogenously choose sample sizes by continuously monitoring their tests. We define always valid p-values and confidence intervals that let users try to take advantage of data as fast as it becomes available, providing valid statistical inference whenever they make their decision. Always valid inference can be interpreted as a natural interface for a sequential hypothesis test, which empowers users to implement a modified test tailored to them. In particular, we show in an appropriate sense that the measures we develop trade off sample size and power efficiently, despite a lack of prior knowledge of the user’s relative preference between these two goals. We also use always valid p-values to obtain multiple hypothesis testing control in the sequential context. Our methodology has been implemented in a large-scale commercial A/B testing platform to analyze hundreds of thousands of experiments to date.
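
As an illustration of what an always valid p-value can look like in practice, the sketch below uses one standard construction, a normal-mixture sequential probability ratio test. The abstract does not name a specific construction, so this is an assumption chosen for illustration. The function name `always_valid_p_values`, the known-variance normal model, and the parameters `theta0`, `sigma2`, and `tau2` are also assumptions, not taken from the record above. The p-value is the running minimum of the inverse mixture likelihood ratio, so it can only decrease as data arrives.

```python
import math


def always_valid_p_values(xs, theta0=0.0, sigma2=1.0, tau2=1.0):
    """Running always valid p-values from a normal-mixture SPRT (a sketch).

    Assumes i.i.d. N(theta, sigma2) observations with known variance sigma2,
    null hypothesis theta == theta0, and a N(theta0, tau2) mixing
    distribution over alternatives. All of these are illustrative choices.
    """
    p_values = []
    p = 1.0          # p_0 = 1 before any data is seen
    total = 0.0
    for n, x in enumerate(xs, start=1):
        total += x
        mean_diff = total / n - theta0
        denom = sigma2 + n * tau2
        # Log of the mixture likelihood ratio after n observations.
        log_lambda = (0.5 * math.log(sigma2 / denom)
                      + (n * n * tau2 * mean_diff ** 2) / (2.0 * sigma2 * denom))
        # The p-value never increases: take the running minimum of 1 / Lambda_n.
        p = min(p, math.exp(-log_lambda))
        p_values.append(p)
    return p_values


# A user may stop the first time the p-value drops to alpha or below.
alpha = 0.05
strong_effect = [1.0] * 100   # deterministic toy data, not real experiment data
p_values = always_valid_p_values(strong_effect)
first_reject = next((n for n, p in enumerate(p_values, start=1) if p <= alpha), None)
```

Under these assumptions, rejecting the first time the p-value falls to `alpha` or below keeps the type I error at most `alpha` regardless of when the user decides to look, which is the property the abstract describes as valid inference "whenever they make their decision".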

Publisher

Institute for Operations Research and the Management Sciences (INFORMS)

Subject

Management Science and Operations Research, Computer Science Applications

References: 16 articles.

1. Confidence sequences for mean, variance, and median

2. Interim analysis: The alpha spending function approach

3. Semiparametric exponential families for heavy-tailed data

Cited by 24 articles.

1. Pigeonhole Design: Balancing Sequential Experiments from an Online Matching Perspective;Management Science;2024-05-24

2. Generic E-variables for exact sequential k-sample tests that allow for optional stopping;Journal of Statistical Planning and Inference;2024-05

3. Enhancing External Validity in Experiments with Ongoing Sampling;2024

4. Continuous Monitoring of Data in Online Randomized Experiments;2023 International Conference on Innovation and Intelligence for Informatics, Computing, and Technologies (3ICT);2023-11-20

5. Online Regularization toward Always-Valid High-Dimensional Dynamic Pricing;Journal of the American Statistical Association;2023-11-17