Abstract
There has been rising interest in using data from genome-wide association studies (GWAS) to detect a genetic signature of natural selection acting on a given phenotype. However, current approaches cannot directly estimate the distribution of fitness effects (DFE), an established concept in population genetics that can elucidate the genomic architecture of a focal trait. To this end, we introduce ASSESS, an inferential method that uses the Poisson Random Field (PRF) to model selection coefficients from genome-wide allele count data, while jointly conditioning GWAS summary statistics on a latent distribution of phenotypic effect sizes. This probabilistic model is unified under the assumption of an explicit relationship between fitness and trait effect to yield a DFE. To gauge the performance of ASSESS, we ran simulation experiments covering a range of use cases and model misspecifications, which showed accurate recovery of the underlying selection signal. As a further proof of concept, we applied ASSESS to an array of publicly available human trait data and replicated previously published empirical findings obtained with an alternative methodology. These demonstrations illustrate the potential of ASSESS to meet an increasing need for powerful yet convenient population genomic inference from GWAS summary statistics.
Author Summary
The growth of genome-wide association studies (GWAS) over the past decade has provided a wealth of resources for uncovering the genomic architecture underlying complex traits, including the footprint of selection. Existing computational tools for inferring natural selection use GWAS results to conduct a binary test for its overall presence, estimate a correlated property, or summarize polygenic selection strength with a single statistic. However, no existing methodology uses GWAS data to estimate the distribution of fitness effects (DFE), which is the most direct measurement of the genetic impact of natural selection acting on a complex trait. To this end, we constructed an approach that directly infers the DFE by aggregating, across the genome, per-site selection coefficients specifically associated with a focal trait. The implementation explicitly models the entire genome-wide set of summary statistics output by a GWAS rather than the individual-level input data, which offers computational efficiency and convenience and alleviates privacy concerns. We expect this to be a promising development as GWAS results continue to accumulate and investigators seek more sophisticated analyses of the relationship between genetics and traits.
Publisher
Cold Spring Harbor Laboratory