Abstract
AbstractPolygenic prediction has yet to make a major clinical breakthrough in precision medicine and psychiatry, where the application of polygenic risk scores are expected to improve clinical decision-making. Most widely used approaches for estimating polygenic risk scores are based on summary statistics from external large-scale genome-wide association studies, which relies on assumptions of matching data distributions. This may hinder the impact of polygenic risk scores in modern diverse populations due to small differences in genetic architectures. Reference-free estimators of polygenic scores are instead based on genomic best linear unbiased predictions and models the population of interest directly. We introduce a framework, namedhapla, with a novel algorithm for clustering haplotypes in phased genotype data to estimate heritability and perform reference-free polygenic prediction in complex traits. We utilize inferred haplotype clusters to compute accurate SNP heritability estimates and polygenic scores in a simulation study and the iPSYCH2012 case-cohort for depression disorders and schizophrenia. We demonstrate that our haplotype-based approach robustly outperforms standard genotype-based approaches, which can help pave the way for polygenic risk scores in the future of precision medicine and psychiatry.haplais freely available athttps://github.com/Rosemeis/hapla.
Publisher
Cold Spring Harbor Laboratory