Author:
Noyvert Boris, Erzurumluoglu A Mesut, Drichel Dmitriy, Omland Steffen, Andlauer Till F M, Mueller Stefanie, Sennels Lau, Becker Christian, Kantorovich Aleksandr, Bartholdy Boris A, Brænne Ingrid, Bolivar-Lopez Julio Cesar, Mistrellides Costas, Belbin Gillian M, Li Jeremiah H, Pickrell Joseph K, de Jong Johann, Arora Jatin, Hu Yao, Wood Clive R, Kriegl Jan M, Podduturi Nikhil, Jensen Jan N, Stutzki Jan, Ding Zhihao
Abstract
Advancements in long-read sequencing technology have accelerated the study of large structural variants (SVs). We created a curated, publicly available, multi-ancestry SV imputation panel by long-read sequencing 888 samples from the 1000 Genomes Project. This high-quality panel was used to impute SVs in approximately 500,000 UK Biobank participants. We demonstrated the feasibility of conducting genome-wide SV association studies at biobank scale using 32 disease-relevant phenotypes related to respiratory, cardiometabolic and liver diseases, in addition to 1,463 protein levels. This analysis identified thousands of genome-wide significant SV associations, including hundreds of conditionally independent signals, thereby enabling novel biological insights. Focusing on genetic association studies of lung function as an example, we demonstrate the added value of SVs for prioritising causal genes at gene-rich loci compared to traditional GWAS using only short variants. We envision that future post-GWAS gene-prioritisation workflows will incorporate SV analyses using this SV imputation panel and framework.
Publisher
Cold Spring Harbor Laboratory