Abstract
AbstractThe UK Biobank is a large cohort study that recruited over 500,000 British participants aged 40-69 in 2006-2010 at 22 assessment centres from across the UK. Self-reported health outcomes and hospital admission data are two types of records that include participants’ disease status. Coronary artery disease (CAD) is the most common cause of death in the UK Biobank cohort. After distinguishing between prevalence and incidence CAD events for all UK Biobank participants, we identified geographical variations in age-standardised rates of CAD between assessment centres. Significant distributional differences were found between the pooled cohort equation scores of UK Biobank participants from England and Scotland using the Mann-Whitney test. Polygenic risk scores of UK Biobank participants from England and Scotland and from different assessment centres differed significantly using permutation tests. Our aim was to discriminate between assessment centres with different disease rates by collecting data on disease-related risk factors. However, relying solely on individual-level predictions and averaging them to obtain group-level predictions proved ineffective, particularly due to the presence of correlated covariates resulting from participation bias. By using the Mundlak model, which estimates a random effects regression by including the group means of the independent variables in the model, we effectively addressed these issues. In addition, we designed a simulation experiment to demonstrate the functionality of the Mundlak model. Our findings have applications in public health funding and strategy, as our approach can be used to predict case rates in the future, as both population structure and lifestyle changes are uncertain.
Publisher
Cold Spring Harbor Laboratory
Reference64 articles.
1. Alten SV , Domingue BW , Galama T , Marees AT . 2022. Reweighting the UK Biobank to reflect its underlying sampling population substantially reduces pervasive selection bias due to volunteering. Preprint at medRxiv..
2. Aragam KG , Jiang T , Goel A , Kanoni S , Wolford BN , Atri DS , Weeks EM , Wang M , Hindy G , Zhou W et al. 2022. Discovery and systematic characterization of risk variants and genes for coronary artery disease in over a million participants. Nature Genetics. pp. 1–13.
3. Association Between Family History and Coronary Heart Disease Death Across Long-Term Follow-Up in Men
4. Explaining Fixed Effects: Random Effects Modeling of Time-Series Cross-Sectional and Panel Data
5. Trends in the epidemiology of cardiovascular disease in the UK