Abstract
Whole genome sequencing (WGS), whole exome sequencing (WES), and array genotyping with imputation (IMP) are common strategies for assessing genetic variation and its association with medically relevant phenotypes. To date there has been no systematic empirical assessment of the yield of these approaches when applied to hundreds of thousands of samples to enable discovery of complex trait genetic signals. Using data for 100 complex traits in 149,195 individuals in the UK Biobank, we systematically compare the relative yield of these strategies in genetic association studies. We find that WGS and WES combined with arrays and imputation (WES+IMP) have the largest association yield. While WGS results in a ∼5-fold increase in the total number of assayed variants over WES+IMP, the number of detected signals differed by only 1% for both single-variant and gene-based association analyses. Since WES+IMP typically results in savings of lab and computational time and resources expended per sample, we evaluate the potential benefits of applying WES+IMP to larger samples. When we extend our WES+IMP analyses to 468,169 UK Biobank individuals, we observe a ∼4-fold increase in association signals with the ∼3-fold increase in sample size. We conclude that prioritizing WES+IMP and large sample sizes, rather than current short-read WGS alternatives, will maximize the number of discoveries in genetic association studies.
Publisher
Cold Spring Harbor Laboratory
Cited by
3 articles.