Author:
Jam Helyaneh Ziaei, Li Yang, DeVito Ross, Mousavi Nima, Ma Nichole, Lujumba Ibra, Adam Yagoub, Maksimov Mikhail, Huang Bonnie, Dolzhenko Egor, Qiu Yunjiang, Kakembo Fredrick Elishama, Joseph Habi, Onyido Blessing, Adeyemi Jumoke, Bakhtiari Mehrdad, Park Jonghun, Javadzadeh Sara, Jjingo Daudi, Adebiyi Ezekiel, Bafna Vineet, Gymrek Melissa
Abstract
Tandem repeats (TRs) represent one of the largest sources of genetic variation in humans and are implicated in a range of phenotypes. Here we present a deep characterization of TR variation based on high-coverage whole-genome sequencing of 3,550 diverse individuals from the 1000 Genomes Project and H3Africa cohorts. We develop a method, EnsembleTR, to integrate genotypes from four separate methods, resulting in high-quality genotypes at more than 1.7 million TR loci. Our catalog reveals novel sequence features influencing TR heterozygosity, identifies population-specific trinucleotide expansions, and finds hundreds of novel eQTL signals. Finally, we generate a phased haplotype panel that can be used to impute most TRs from nearby single-nucleotide polymorphisms (SNPs) with high accuracy. Overall, the TR genotypes and reference haplotype panel generated here will serve as valuable resources for future genome-wide and population-wide studies of TRs and their role in human phenotypes.
Publisher
Cold Spring Harbor Laboratory
Cited by
6 articles.