Leveraging change point detection to discover natural experiments in data

Published:2022-09-03 Issue:1 Volume:11
ISSN:2193-1127
Container-title:EPJ Data Science
language:en
Short-container-title:EPJ Data Sci.

Author:

He Yuzi, Burghardt Keith A., Lerman Kristina

Abstract

Change point detection has many practical applications, from anomaly detection in data to scene changes in robotics; however, finding changes in high-dimensional data is an ongoing challenge. We describe a self-training, model-agnostic framework to detect changes in arbitrarily complex data. The method consists of two steps. First, it labels data as before or after a candidate change point and trains a classifier to predict these labels. The accuracy of this classifier varies for different candidate change points. Second, by modeling how the accuracy changes, we can infer the true change point and the fraction of data affected by the change (a proxy for detection confidence). We demonstrate how our framework can achieve low bias over a wide range of conditions and detect changes in high-dimensional, noisy data more accurately than alternative methods. We use the framework to identify changes in real-world data and measure their effects using regression discontinuity designs, thereby uncovering potential natural experiments, such as the effect of pandemic lockdowns on air pollution and the effect of policy changes on performance and persistence in a learning platform. Our method opens new avenues for data-driven discovery due to its flexibility, accuracy and robustness in identifying changes in data.
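
The labeling-and-classification idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes a toy nearest-centroid classifier, an even/odd train/test split, synthetic Gaussian data, and it simply picks the candidate with the highest balanced accuracy instead of modeling the accuracy curve as the paper does. It does not estimate the fraction of data affected. All function names (`balanced_accuracy`, `detect_change_point`, `make_series`) are hypothetical.

```python
import random


def centroid(rows):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def balanced_accuracy(data, t):
    """Label points before index t as 0 and the rest as 1, fit a
    nearest-centroid classifier on even indices, score it on odd indices.
    Balanced accuracy (mean per-class recall) is an assumption made here
    so that unequal class sizes do not favor candidates near the edges."""
    train = [(x, int(i >= t)) for i, x in enumerate(data) if i % 2 == 0]
    test = [(x, int(i >= t)) for i, x in enumerate(data) if i % 2 == 1]
    cents = [centroid([x for x, y in train if y == c]) for c in (0, 1)]
    hits, totals = [0, 0], [0, 0]
    for x, y in test:
        pred = 0 if sq_dist(x, cents[0]) <= sq_dist(x, cents[1]) else 1
        totals[y] += 1
        hits[y] += int(pred == y)
    return 0.5 * (hits[0] / totals[0] + hits[1] / totals[1])


def detect_change_point(data, candidates):
    """Score every candidate and return the best one plus all scores.
    Taking the argmax is a simplification of the paper's curve modeling."""
    scores = {t: balanced_accuracy(data, t) for t in candidates}
    best = max(scores, key=scores.get)
    return best, scores


def make_series(n=200, change_at=120, dim=3, shift=3.0, seed=0):
    """Synthetic data: unit-variance Gaussian noise whose mean jumps by
    `shift` in every dimension at index `change_at`."""
    rng = random.Random(seed)
    return [
        [rng.gauss(shift if i >= change_at else 0.0, 1.0) for _ in range(dim)]
        for i in range(n)
    ]


data = make_series()
estimate, scores = detect_change_point(data, range(20, 181, 10))
print("estimated change point:", estimate)
```

In this toy setting the held-out accuracy should be highest when the candidate split lines up with the true change, because only then do the "before" and "after" labels correspond to genuinely different distributions. Any classifier could replace the nearest-centroid rule, which is the sense in which such an approach is model-agnostic.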

Funder

Defense Advanced Research Projects Agency

Publisher

Springer Science and Business Media LLC

Subject

Computational Mathematics, Computer Science Applications, Modeling and Simulation

Link

https://link.springer.com/content/pdf/10.1140/epjds/s13688-022-00361-7.pdf

Cited by 1 article.

1. Uncovering Steady State Executions in Java Microbenchmarking with Call Graph Analysis; Companion of the 2023 ACM/SPEC International Conference on Performance Engineering; 2023-04-15