Rapid and accurate multi-phenotype imputation for millions of individuals

Published: 2023-06-26

Author:

Gu Lin-Lin, Wu Hong-Shan, Zhang Yong-Jie, Liu Tian-Yi, He Jing-Cheng, Liu Xiao-Lei, Chen Guo-Bo, Jiang Dan, Fang Ming

Abstract

Deep phenotyping can enhance the power of genetic analyses such as genome-wide association studies (GWAS), but the frequent occurrence of missing phenotypes compromises the potential of such resources. Although many phenotypic imputation methods have been developed, accurate imputation for millions of individuals remains extremely challenging. In the present study, leveraging efficient machine learning (ML)-based algorithms, we developed a novel multi-phenotype imputation method based on mixed fast random forest (PIXANT), which is several orders of magnitude more efficient in runtime and computer memory usage than the state-of-the-art methods when applied to the UK Biobank (UKB) data and is scalable to cohorts with millions of individuals. Our simulations with hundreds of individuals showed that PIXANT was superior or comparable in accuracy to the most advanced methods available. We also applied PIXANT to impute 425 phenotypes for the UKB data of 277,301 unrelated white British citizens, performed GWAS on the imputed phenotypes, and identified 15.6% more GWAS loci than before imputation (8,710 vs. 7,355). Owing to the increased statistical power of GWAS, a number of novel genes were identified, such as RNF220, SCN10A and RGS6, which affect heart rate, demonstrating the use of imputed phenotype data in a large cohort to discover novel genes for complex traits.

Publisher

Cold Spring Harbor Laboratory

References (51 articles)

1. Human phenotyping on a population scale. Nat Methods, 2015

2. Large-scale association analysis identifies 13 new susceptibility loci for coronary artery disease

3. Twelve type 2 diabetes susceptibility loci identified through large-scale association analysis

4. The UK Biobank resource with deep phenotyping and genomic data

5. Multitrait analysis of glaucoma identifies new risk loci and enables polygenic prediction of disease susceptibility and progression

Cited by 1 article.

1. UK BioCoin: Swift Trait-Specific Summary Statistics Regression for UK Biobank. 2024-04-15