Author:
Petersen Anne Helby, Ekstrøm Claus Thorn, Spirtes Peter, Osler Merete
Abstract
Life-course epidemiology relies on specifying complex (causal) models that describe how variables interplay over time. Traditionally, such models have been constructed by perusing existing theory and previous studies. By comparing data-driven and theory-driven models, we investigated whether data-driven causal discovery algorithms can help in this process. We focused on a longitudinal data set on a cohort of Danish men (the Metropolit Study, 1953–2017). The theory-driven models were constructed by 2 subject-field experts. The data-driven models were constructed by use of the temporal Peter-Clark (TPC) algorithm. The TPC algorithm utilizes the temporal information embedded in life-course data. We found that the data-driven models recovered some, but not all, causal relationships included in the theory-driven expert models. The data-driven method was especially good at identifying direct causal relationships that the experts had high confidence in. Moreover, in a post hoc assessment, we found that most of the direct causal relationships proposed by the data-driven model but not included in the theory-driven model were plausible. Thus, the data-driven model may propose additional meaningful causal hypotheses that are new or have been overlooked by the experts. In conclusion, data-driven methods can aid causal model construction in life-course epidemiology, and combining both data-driven and theory-driven methods can lead to even stronger models.
Publisher
Oxford University Press (OUP)
Cited by
4 articles.