Author:
Staerk Christian, Byrd Alliyah, Mayr Andreas
Abstract
Variable selection in regression models is a particularly important issue in epidemiology, where one usually encounters observational studies. In contrast to randomized trials or experiments, confounding is often not controlled by the study design, but has to be accounted for by suitable statistical methods. For instance, when risk factors should be identified with unconfounded effect estimates, multivariable regression techniques can help to adjust for confounders. We investigated the current practice of variable selection in 4 major epidemiologic journals in 2019 and found that the majority of articles used subject-matter knowledge to determine a priori the set of included variables. In comparison with previous reviews from 2008 and 2015, fewer articles applied data-driven variable selection. Furthermore, for most articles the main aim of analysis was hypothesis-driven effect estimation in rather low-dimensional data situations (i.e., large sample size compared with the number of variables). Based on our results, we discuss the role of data-driven variable selection in epidemiology.
Publisher
Oxford University Press (OUP)
Cited by
2 articles.