Abstract
Background
The accurate description of ancestry is essential to interpret and integrate human genomics data, and to ensure that advances in the field of genomics benefit individuals from all ancestral backgrounds. However, there are no established guidelines for the consistent, unambiguous and standardized description of ancestry. To fill this gap, we provide a framework designed for the representation of ancestry in GWAS data, but with wider application to studies and resources involving human subjects.
Results
Here we describe our framework and its application to the representation of ancestry data in a widely used, publicly available genomics resource, the NHGRI-EBI GWAS Catalog. We present the first analyses of GWAS data using our ancestry categories, demonstrating that the framework facilitates the tracking of ancestry in big data sets. We show the broader relevance and integration potential of our method by using it to describe the well-established HapMap and 1000 Genomes reference populations. Finally, to encourage adoption, we outline recommendations for authors to implement when describing samples.
Conclusions
While the known bias towards inclusion of European ancestry individuals in GWA studies persists, African and Hispanic or Latin American ancestry populations contribute a disproportionately high number of associations, suggesting that analyses including these groups may be more effective at identifying new associations. We believe the widespread adoption of our framework will increase standardization of ancestry data, thus enabling improved analysis, interpretation and integration of human genomics data and furthering our understanding of disease.
Publisher
Cold Spring Harbor Laboratory
Cited by
3 articles.