Abstract
SUMMARYThe 1000 Genomes Project (1kGP) is the largest fully open resource of whole genome sequencing (WGS) data consented for public distribution of raw sequence data without access or use restrictions. The final release of the 1kGP included 2,504 unrelated samples from 26 populations and was based primarily on low coverage WGS. Here, we present a new, high coverage 3,202-sample WGS 1kGP resource, sequenced to a targeted depth of 30X using the Illumina NovaSeq 6000 system, which now includes 602 complete trios. We performed SNV/INDEL calling against the GRCh38 reference using GATK’s HaplotypeCaller, and generated a comprehensive set of SVs by integrating multiple analytic methods through a sophisticated machine learning model. We make all the data generated as part of this project publicly available and we envision it to become the new de facto public resource for the worldwide genomics and genetics community.
Publisher
Cold Spring Harbor Laboratory
Cited by
127 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献