Abstract
The ancestral recombination graph (ARG) is a graph-like structure that encodes a detailed genealogical history of a set of individuals along the genome. ARGs that are accurately reconstructed from genomic data have several downstream applications, but inference from data sets comprising millions of samples and variants remains computationally challenging. We introduce Threads, a threading-based method that significantly reduces the computational costs of ARG inference while retaining high accuracy. We apply Threads to infer the ARG of 487,409 genomes from the UK Biobank using ∼10 million high-quality imputed variants, reconstructing a detailed genealogical history of the samples while compressing the input genotype data. Additionally, we develop ARG-based imputation strategies that increase genotype imputation accuracy for ultra-rare variants (MAC ≤ 10) from UK Biobank exome sequencing data by 5–10%. We leverage ARGs inferred by Threads to detect associations with 52 quantitative traits in non-European UK Biobank samples, identifying 22.5% more signals than ARG-Needle. These analyses underscore the value of using computationally efficient genealogical modeling to improve and complement genotype imputation in large-scale genomic studies.
Publisher
Cold Spring Harbor Laboratory