Affiliation:
1. Department of Epidemiology and Biostatistics, Memorial Sloan‐Kettering Cancer Center, New York, New York
2. Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Rockville, Maryland
Abstract
Our work was motivated by the question of whether, and to what extent, well‐established risk factors mediate the racial disparity observed for colorectal cancer (CRC) incidence in the United States. Mediation analysis examines the relationships between an exposure, a mediator, and an outcome. All available methods require access to a single complete data set containing these three variables. However, because population‐based studies usually include few non‐White participants, these approaches have limited utility in answering our motivating question. Recently, we developed novel methods that integrate several data sets with incomplete information for mediation analysis. These methods have two limitations: (i) they consider only a single mediator and (ii) they require a data set containing individual‐level data on the mediator and exposure (and possibly confounders) obtained by independent and identically distributed sampling from the target population. Here, we propose a new method for mediation analysis with several different data sets that accommodates complex survey and registry data and allows for multiple mediators. In simulations, the proposed approach yields unbiased estimates of causal effects and confidence intervals with nominal coverage. We apply our method to data from U.S. cancer registries, a U.S.‐population‐representative survey, and summary‐level odds‐ratio estimates to rigorously evaluate what proportion of the difference in CRC risk between non‐Hispanic Whites and Blacks is mediated by three potentially modifiable risk factors (CRC screening history, body mass index, and regular aspirin use).
Funder
National Cancer Institute