Abstract
Understanding the genetic basis of complex disease is a critical research goal due to the immense, worldwide burden of these diseases. Pan-biobank genome-wide association studies (GWAS) provide a powerful resource in complex disease genetics, generating shareable summary statistics on thousands of phenotypes. However, biobank-scale GWAS have two notable limitations: they are resource-intensive to compute, and they provide no results for hand-crafted phenotype definitions, which are often more relevant to study. Here we present Indirect GWAS (indGWAS), a summary-statistic-based method that addresses these limitations. IndGWAS computes GWAS statistics for any phenotype defined as a linear combination of other phenotypes. Our method can reduce runtime by an order of magnitude for large pan-biobank GWAS, and it enables ultra-rapid (less than one minute) GWAS on hand-crafted phenotype definitions using only summary statistics. Overall, this method advances complex disease research by facilitating more accessible and cost-effective genetic studies using large observational data.
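The idea of computing GWAS statistics for a linear combination of phenotypes can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a single variant, ordinary least squares with an intercept and no other covariates, and feature phenotypes measured on the same individuals with no missing data. The function names `direct_gwas` and `indirect_gwas` are hypothetical. Under these assumptions, the effect size of the combined phenotype is the weighted sum of the feature effect sizes, and its standard error follows from the phenotypic covariance matrix together with the genotype variance implied by any one feature's summary statistics.

```python
import numpy as np

def direct_gwas(g, y):
    """Simple linear regression of phenotype y on genotype g (with intercept).
    Returns (beta, standard error)."""
    gc, yc = g - g.mean(), y - y.mean()
    s_gg = gc @ gc
    beta = (gc @ yc) / s_gg
    resid = yc - beta * gc
    se = np.sqrt((resid @ resid) / (len(g) - 2) / s_gg)
    return beta, se

def indirect_gwas(betas, ses, pheno_cov, weights, n):
    """Hypothetical helper: statistics for y = sum_j weights[j] * y_j,
    computed from per-feature summary statistics only.
    Assumes no covariates beyond the intercept (an illustrative simplification)."""
    betas = np.asarray(betas, dtype=float)
    ses = np.asarray(ses, dtype=float)
    w = np.asarray(weights, dtype=float)
    cov = np.asarray(pheno_cov, dtype=float)
    dof = n - 2
    # Genotype variance implied by the first feature's summary statistics.
    var_g = cov[0, 0] / (dof * ses[0] ** 2 + betas[0] ** 2)
    # Effect sizes are linear in the phenotype.
    beta = w @ betas
    # Variance of the combined phenotype from the phenotypic covariance.
    var_y = w @ cov @ w
    se = np.sqrt((var_y / var_g - beta ** 2) / dof)
    return beta, se

# Simulated data (illustrative only): one variant, three feature phenotypes.
rng = np.random.default_rng(0)
n = 500
g = rng.binomial(2, 0.3, size=n).astype(float)
Y = rng.normal(size=(n, 3)) + np.outer(g, [0.2, -0.1, 0.05])
w = np.array([1.0, 0.5, -2.0])

feature_stats = [direct_gwas(g, Y[:, j]) for j in range(Y.shape[1])]
betas = [b for b, _ in feature_stats]
ses = [s for _, s in feature_stats]
pheno_cov = np.cov(Y, rowvar=False)

indirect = indirect_gwas(betas, ses, pheno_cov, w, n)
direct = direct_gwas(g, Y @ w)
print("indirect:", indirect)
print("direct:  ", direct)
```

In this simplified setting the two results agree up to floating-point error, which is why only summary statistics and the phenotypic covariance are needed. Handling covariates would require adjusting the covariance and the degrees of freedom, which this sketch does not attempt.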
Publisher
Cold Spring Harbor Laboratory