Efficient mixed model approach for large-scale genome-wide association studies of ordinal categorical phenotypes


Bi Wenjian, Zhou Wei, Dey Rounak, Mukherjee Bhramar, Sampson Joshua N, Lee Seunggeun


Abstract

In genome-wide association studies (GWAS), ordinal categorical phenotypes are widely used to measure human behaviors, satisfaction, and preferences. However, for lack of dedicated analysis tools, methods designed for binary and quantitative traits have often been applied inappropriately to ordinal categorical phenotypes, resulting in inflated type I error rates or reduced power. To accurately model the dependence of an ordinal categorical phenotype on covariates, we propose an efficient mixed model association test, the Proportional Odds Logistic Mixed Model (POLMM). We demonstrate that POLMM is computationally efficient enough to analyze large datasets with hundreds of thousands of genetically related samples, controls type I error rates at stringent significance levels regardless of the phenotypic distribution, and is more powerful than alternative methods. We applied POLMM to 258 ordinal categorical phenotypes using array-genotyped and imputed samples from 408,961 individuals in the UK Biobank. In total, we identified 5,885 genome-wide significant variants, of which 424 (7.2%) are rare variants with minor allele frequency (MAF) < 0.01.
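For orientation, the model class named above can be sketched in its standard textbook form: a cumulative-logit (proportional odds) model with a random effect for sample relatedness. The notation below is assumed for illustration and is not taken from the abstract: $y_i$ is the ordinal phenotype with $J$ levels, $X_i$ the covariates, $G_i$ the genotype, $b_i$ the random effect, $\Psi$ a genetic relationship matrix, and $\tau$ a variance component.

```latex
\begin{aligned}
\operatorname{logit} P(y_i \le j \mid X_i, G_i, b_i)
  &= \varepsilon_j - \left( X_i^{\top}\beta + G_i\gamma + b_i \right),
  \qquad j = 1, \dots, J-1, \\
\varepsilon_1 &< \varepsilon_2 < \dots < \varepsilon_{J-1}, \\
b = (b_1, \dots, b_n)^{\top} &\sim \mathcal{N}\!\left(0, \tau \Psi\right).
\end{aligned}
```

Under this sketch, a single genotype effect $\gamma$ is shared across all $J-1$ cutpoints (the proportional odds assumption), and the association test for a variant corresponds to $H_0\colon \gamma = 0$.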


Cold Spring Harbor Laboratory








