A pangenome reference of 36 Chinese populations

Published:2023-06-14 Issue:7968 Volume:619 Page:112-121
ISSN:0028-0836
Container-title:Nature
language:en
Short-container-title:Nature

Author:

Gao Yang, Yang Xiaofei, Chen Hao, Tan Xinjiang, Yang Zhaoqing, Deng Lian, Wang Baonan, Kong Shuang, Li Songyang, Cui Yuhang, Lei Chang, Wang Yimin, Pan Yuwen, Ma Sen, Sun Hao, Zhao Xiaohan, Shi Yingbing, Yang Ziyi, Wu Dongdong, Wu Shaoyuan, Zhao Xingming, Shi Binyin, Jin Li, Hu Zhibin, Mao Chuangxue, Fan Shaohua, Gao Qiang, Dai Juncheng, Bu Fengxiao, He Guanglin, Wu Yang, Yuan Huijun, Li Jinchen, Chen Chao, Yang Jian, Wei Chaochun, Jin Xin, Shen Xia, Lu Yan, Chu Jiayou, Ye Kai, Xu Shuhua

Abstract

Human genomics is witnessing an ongoing paradigm shift from a single reference sequence to a pangenome form, but populations of Asian ancestry are underrepresented. Here we present data from the first phase of the Chinese Pangenome Consortium, including a collection of 116 high-quality and haplotype-phased de novo assemblies based on 58 core samples representing 36 minority Chinese ethnic groups. With an average 30.65× high-fidelity long-read sequence coverage, an average contiguity N50 of more than 35.63 megabases and an average total size of 3.01 gigabases, the CPC core assemblies add 189 million base pairs of euchromatic polymorphic sequences and 1,367 protein-coding gene duplications to GRCh38. We identified 15.9 million small variants and 78,072 structural variants, of which 5.9 million small variants and 34,223 structural variants were not reported in a recently released pangenome reference [1]. The Chinese Pangenome Consortium data demonstrate a remarkable increase in the discovery of novel and missing sequences when individuals are included from underrepresented minority ethnic groups. The missing reference sequences were enriched with archaic-derived alleles and genes that confer essential functions related to keratinization, response to ultraviolet radiation, DNA repair, immunological responses and lifespan, implying great potential for shedding new light on human evolution and recovering missing heritability in complex disease mapping.
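
The abstract reports assembly contiguity as an N50 value. As a minimal sketch of how that statistic is commonly defined (the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembly size), the Python below computes N50 from a list of contig lengths. The function name `n50` and the toy lengths are hypothetical and chosen for illustration only; they are not taken from the paper, and this is not the consortium's own pipeline.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths (in base pairs)."""
    # Sort from longest to shortest so the largest contigs are counted first.
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("need at least one contig length")

    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        # The first contig that pushes the running sum to at least half
        # of the total assembly size defines the N50.
        if running >= half_total:
            return length


# Hypothetical toy assembly (arbitrary units), not data from the paper.
toy_assembly = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_assembly))
```

In this toy case the total length is 300, and the two longest contigs (80 and 70) already sum to 150, so the N50 is 70. A higher N50 means that half of the assembly sits in longer contiguous pieces.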

Publisher

Springer Science and Business Media LLC

Subject

Multidisciplinary

Link

https://www.nature.com/articles/s41586-023-06173-7.pdf

References: 65 articles.

1. Liao, W.-W. et al. A draft human pangenome reference. Preprint at https://doi.org/10.1101/2022.07.09.499321 (2022).

2. Lou, H. et al. Haplotype-resolved de novo assembly of a Tujia genome suggests the necessity for high-quality population-specific genome references. Cell Syst. 13, 321–333 (2022).

3. Wang, T. et al. The Human Pangenome Project: a global resource to map genomic diversity. Nature 604, 437–446 (2022).

4. Sherman, R. M. & Salzberg, S. L. Pan-genomics in the human genome era. Nat. Rev. Genet. 21, 243–254 (2020).

5. Lu, D. & Xu, S. Principal component analysis reveals the 1000 Genomes Project does not sufficiently cover the human genetic diversity in Asia. Front. Genet. 4, 127 (2013).

Cited by 53 articles.

1. Medicinal plants used by minority ethnic groups in China: Taxonomic diversity and conservation needs;Journal of Ethnopharmacology;2024-11

2. Copy-number variants differ in frequency across genetic ancestry groups;Human Genetics and Genomics Advances;2024-10

3. Integrated analysis of facial microbiome and skin physio-optical properties unveils cutotype-dependent aging effects;Microbiome;2024-09-05

4. Beyond the Human Genome Project: The Age of Complete Human Genome Sequences and Pangenome References;Annual Review of Genomics and Human Genetics;2024-08-27

5. De Novo Genome Assemblies From Two Indigenous Americans from Arizona Identify New Polymorphisms in Non-Reference Sequences;Genome Biology and Evolution;2024-08-27