Through the lens of causal inference: Decisions and pitfalls of covariate selection
Author:
Chen Gang, Cai Zhengchen, Taylor Paul A.
Abstract
The critical importance of justifying the inclusion of covariates is a facet often overlooked in data analysis. While the incorporation of covariates typically follows informal guidelines, we argue for a comprehensive exploration of underlying principles to avoid significant statistical and interpretational challenges. Our focus is on addressing three common yet problematic practices: the indiscriminate lumping of covariates, the lack of rationale for covariate inclusion, and the oversight of potential issues in result reporting. These challenges, prevalent in neuroimaging models involving covariates such as reaction time, demographics, and morphometric measures, can introduce biases, including overestimation, underestimation, masking, sign flipping, or spurious effects. Our exploration of causal inference principles underscores the pivotal role of domain knowledge in guiding covariate selection, challenging the common reliance on statistical measures. This understanding carries implications for experimental design, model-building, and result interpretation. We draw connections between these insights and reproducibility concerns, specifically addressing the selection bias resulting from the widespread practice of strict thresholding, akin to the logical pitfall associated with “double dipping.” Recommendations for robust data analysis involving covariates encompass explicit research question statements, justified covariate inclusions/exclusions, centering quantitative variables for interpretability, appropriate reporting of effect estimates, and advocating a “highlight, don’t hide” approach in result reporting. These suggestions are intended to enhance the robustness, transparency, and reproducibility of covariate-driven analyses, encompassing investigations involving consortium datasets such as ABCD and UK Biobank.
We discuss how researchers can use a transparent depiction of the covariate relationships to enhance the ethos of open science and promote research reproducibility.
Publisher
Cold Spring Harbor Laboratory