StatBreak: Identifying “Lucky” Data Points Through Genetic Algorithms

Published: 2020-05-21 Issue: 2 Volume: 3 Pages: 216-228
ISSN:2515-2459
Container-title:Advances in Methods and Practices in Psychological Science
language:en

Author:

Rosenbusch Hannes¹, Hilbert Leon P.², Evans Anthony M.¹, Zeelenberg Marcel¹,³

Affiliation:

1. Department of Social Psychology, Tilburg University

2. Department of Social, Economic and Organisational Psychology, Leiden University

3. Department of Marketing, VU Amsterdam

Abstract

Sometimes interesting statistical findings are produced by a small number of “lucky” data points within the tested sample. To address this issue, researchers and reviewers are encouraged to investigate outliers and influential data points. Here, we present StatBreak, an easy-to-apply method, based on a genetic algorithm, that identifies the observations that most strongly contributed to a finding (e.g., effect size, model fit, p value, Bayes factor). Within a given sample, StatBreak searches for the largest subsample in which a previously observed pattern is not present or is reduced below a specifiable threshold. Thus, it answers the following question: “Which (and how few) ‘lucky’ cases would need to be excluded from the sample for the data-based conclusion to change?” StatBreak consists of a simple R function and flags the luckiest data points for any form of statistical analysis. Here, we demonstrate the effectiveness of the method with simulated and real data across a range of study designs and analyses. Additionally, we describe StatBreak’s R function and explain how researchers and reviewers can apply the method to the data they are working with.
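The search described in the abstract can be illustrated with a small sketch. This is not the StatBreak R function, and it does not reproduce the authors' settings: the statistic (absolute Pearson correlation), the threshold of 0.3, the fitness penalty, the genetic-algorithm parameters, the toy data, and all function names (`abs_pearson`, `fitness`, `find_lucky_points`) are assumptions chosen for illustration. It only shows the general idea of searching for the largest subsample in which a pattern falls below a threshold.

```python
import math
import random


def abs_pearson(xs, ys):
    """Absolute Pearson correlation, or None when it is undefined."""
    n = len(xs)
    if n < 3:
        return None
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return abs(sxy / math.sqrt(sxx * syy))


def fitness(mask, xs, ys, threshold):
    """Subsample size if the pattern is below threshold, else a penalty."""
    kept_x = [x for x, keep in zip(xs, mask) if keep]
    kept_y = [y for y, keep in zip(ys, mask) if keep]
    r = abs_pearson(kept_x, kept_y)
    if r is None:
        return -2.0            # undefined statistic: worst possible score
    if r < threshold:
        return float(sum(mask))  # feasible: reward larger subsamples
    return -r                  # infeasible: weaker pattern scores higher


def find_lucky_points(xs, ys, threshold=0.3, pop_size=60,
                      generations=200, seed=0):
    """Toy genetic algorithm over keep/exclude masks (assumed settings)."""
    rng = random.Random(seed)
    n = len(xs)

    def score(mask):
        return fitness(mask, xs, ys, threshold)

    population = [[rng.random() < 0.5 for _ in range(n)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        next_gen = [max(population, key=score)]  # elitism: keep the best mask
        while len(next_gen) < pop_size:
            # Tournament selection of two parents.
            a = max(rng.sample(population, 3), key=score)
            b = max(rng.sample(population, 3), key=score)
            # Uniform crossover followed by bit-flip mutation.
            child = [rng.choice(pair) for pair in zip(a, b)]
            child = [(not g) if rng.random() < 1 / n else g for g in child]
            next_gen.append(child)
        population = next_gen

    best = max(population, key=score)
    kept = [i for i, keep in enumerate(best) if keep]
    excluded = [i for i, keep in enumerate(best) if not keep]
    return kept, excluded


# Hypothetical toy data: ten uncorrelated points plus two extreme points.
points = [(x, y) for x in (-2, -1, 0, 1, 2) for y in (-1, 1)]
points += [(10, 10), (12, 12)]
xs = [p[0] for p in points]
ys = [p[1] for p in points]

kept, excluded = find_lucky_points(xs, ys)
print("full-sample |r|:", abs_pearson(xs, ys))
print("excluded indices:", excluded)
```

In this toy setup the correlation in the full sample is driven by the last two points, so they are the natural candidates for exclusion. A real application would substitute the statistic of interest (effect size, p value, Bayes factor, and so on) for `abs_pearson`.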

Publisher

SAGE Publications

Link

http://journals.sagepub.com/doi/pdf/10.1177/2515245920917950

Cited by 2 articles.

1. References; Introduction to Robust Estimation and Hypothesis Testing; 2022

2. Estimating Measures of Location and Scale; Introduction to Robust Estimation and Hypothesis Testing; 2022