Violating the normality assumption may be the lesser of two evils

Published: 2018-12-20

Author:

Ulrich Knief, Wolfgang Forstmeier

Abstract

When data are not normally distributed (e.g. skewed, zero-inflated, binomial, or count data) researchers are often uncertain whether it may be legitimate to use tests that assume Gaussian errors (e.g. regression, t-test, ANOVA, Gaussian mixed models), or whether one has to either model a more specific error structure or use randomization techniques.

Here we use Monte Carlo simulations to explore the pros and cons of fitting Gaussian models to non-normal data in terms of risk of type I error, power and utility for parameter estimation.
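A minimal sketch of this kind of Monte Carlo check, not the authors' actual simulation design: it draws two groups from the same skewed (log-normal) distribution, so the null hypothesis is true, applies a two-sample t-test, and counts how often P falls below alpha. The distribution, sample size, number of simulations, and the function name `type_i_error_rate` are all illustrative assumptions.

```python
import numpy as np
from scipy import stats


def type_i_error_rate(n_per_group=20, n_sims=5000, alpha=0.05, seed=1):
    """Estimate the false-positive rate of a t-test on skewed data.

    Both groups come from the same log-normal distribution (an assumed
    example of non-normal data), so every significant result is a
    type I error.
    """
    rng = np.random.default_rng(seed)
    # One row per simulated experiment, one column per observation.
    a = rng.lognormal(mean=0.0, sigma=1.0, size=(n_sims, n_per_group))
    b = rng.lognormal(mean=0.0, sigma=1.0, size=(n_sims, n_per_group))
    # Classical (equal-variance) two-sample t-test, one test per row.
    p_values = stats.ttest_ind(a, b, axis=1).pvalue
    return float(np.mean(p_values < alpha))


rate = type_i_error_rate()
print(f"Estimated type I error rate at alpha = 0.05: {rate:.3f}")
```

An estimate close to the nominal 0.05 would indicate that the test's P-values remain trustworthy for this particular distribution and sample size; other settings would need to be simulated separately.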

We find that Gaussian models are remarkably robust to non-normality over a wide range of conditions, meaning that P-values remain fairly reliable except for data with influential outliers judged at strict alpha levels. Gaussian models also perform well in terms of power and they can be useful for parameter estimation but usually not for extrapolation. Transformation of data before analysis is often advisable and visual inspection for outliers and heteroscedasticity is important for assessment. In strong contrast, some non-Gaussian models and randomization techniques bear a range of risks that are often insufficiently known. High rates of false-positive conclusions can arise for instance when overdispersion in count data is not controlled appropriately or when randomization procedures ignore existing non-independencies in the data.
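The overdispersion risk described above can be illustrated with a small simulation. This is a sketch under stated assumptions, not the paper's own code: counts for two equally sized groups are drawn from the same negative binomial distribution (overdispersed relative to Poisson), then tested once with a Wald test that assumes Poisson errors and once with an ordinary t-test. For a single two-level predictor with equal group sizes, the Poisson Wald statistic has the closed form used below; the group size, mean, dispersion parameter, and function name `false_positive_rates` are illustrative choices.

```python
import numpy as np
from scipy import stats


def false_positive_rates(n_per_group=50, mean=5.0, k=1.0,
                         n_sims=5000, alpha=0.05, seed=1):
    """Compare type I error of a Poisson Wald test and a t-test
    on overdispersed counts when there is no true group difference."""
    rng = np.random.default_rng(seed)
    # Negative binomial with mean `mean` and variance mean + mean**2 / k.
    p = k / (k + mean)
    a = rng.negative_binomial(k, p, size=(n_sims, n_per_group))
    b = rng.negative_binomial(k, p, size=(n_sims, n_per_group))

    # Poisson Wald test for the log rate ratio between two equal-sized
    # groups: estimate log(sum_a / sum_b), SE sqrt(1/sum_a + 1/sum_b).
    sum_a = a.sum(axis=1)
    sum_b = b.sum(axis=1)
    z = np.log(sum_a / sum_b) / np.sqrt(1.0 / sum_a + 1.0 / sum_b)
    p_poisson = 2.0 * stats.norm.sf(np.abs(z))

    # Gaussian alternative: two-sample t-test on the raw counts.
    p_gauss = stats.ttest_ind(a, b, axis=1).pvalue

    return (float(np.mean(p_poisson < alpha)),
            float(np.mean(p_gauss < alpha)))


poisson_rate, gaussian_rate = false_positive_rates()
print(f"Poisson Wald test: {poisson_rate:.3f}")
print(f"Gaussian t-test:   {gaussian_rate:.3f}")
```

Because the Poisson test assumes the variance equals the mean, it underestimates the standard error when the counts are overdispersed, which is why its false-positive rate is expected to exceed the nominal alpha here while the t-test stays near it.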

Overall, we argue that violating the normality assumption bears risks that are limited and manageable, while several more sophisticated approaches are relatively error prone and difficult to check during peer review. Hence, as long as scientists and reviewers are not fully aware of the risks, science might benefit from preferentially trusting Gaussian mixed models in which random effects account for non-independencies in the data in a transparent way.

Tweetable abstract

Gaussian models are remarkably robust to even dramatic violations of the normality assumption.

Publisher

Cold Spring Harbor Laboratory

Cited by 20 articles.

1. Task performance with touchscreen interfaces under conditions of head-down tilt bed rest. CEAS Space Journal, 2023-05-12.

2. A new approach to using Diffusive Gradient in Thin-films (DGT) labile concentration for Water Framework Directive chemical status assessment: adaptation of Environmental Quality Standard to DGT for cadmium, nickel and lead. Environmental Sciences Europe, 2023-04-29.

3. The power of a touch: Regular touchscreen training but not its termination affects hormones and behavior in mice. Frontiers in Behavioral Neuroscience, 2023-03-16.

4. The forensic implications of camouflaging: a study into victimisation and offending associated with autism and pathological demand avoidance. Advances in Autism, 2022-08-16.

5. Explaining education-based difference in systematic processing of COVID-19 information: Insights into global recovery from infodemic. Information Processing & Management, 2022-07.