Author:
Daw Elbait Gihan, Henschel Andreas, Tay Guan K., Al Safar Habiba S.
Abstract
The ethnic composition of a country's population contributes to the uniqueness of each national DNA sequencing project and, ideally, individual reference genomes are required to reduce the confounding effect of ethnic bias. This work presents a representative whole genome sequencing effort for an understudied population. Specifically, high-coverage consensus sequences from 120 whole genomes and 33 whole exomes were used to construct the first population-specific major allele reference genome for the United Arab Emirates (UAE). When this reference was applied and compared to the archetype hg19 reference, variant calls in local Emirati genomes were reduced by ∼19% (i.e., some 1 million fewer calls). In compiling the United Arab Emirates Reference Genome (UAERG), sets of 23,038,090 annotated short variants (novel: 1,790,171) and 137,713 structural variants (novel: 8,462) were identified, along with their allele frequencies (AFs) and distribution across the genome. Population-specific genetic characteristics, including loss-of-function variants, admixture, and ancestral haplogroup distribution, were identified and are reported here. We also detected a strong correlation between FST and admixture components in the UAE. This baseline study was conceived to establish a high-quality reference genome and a genetic variation resource that enable the development of regional population-specific initiatives and thus inform the application of population studies and precision medicine in the UAE.
Funder
Khalifa University of Science, Technology and Research
Subject
Genetics (clinical), Genetics, Molecular Medicine
Cited by
11 articles.