Pan-African genome demonstrates how population-specific genome graphs improve high-throughput sequencing data analysis

Published: 2022-08-04 Issue: 1 Volume: 13
ISSN: 2041-1723
Container-title: Nature Communications
Language: en
Short-container-title: Nat Commun

Author:

Tetikol H. Serhat, Turgut Deniz, Narci Kubra, Budak Gungor, Kalay Ozem, Arslan Elif, Demirkaya-Budak Sinem, Dolgoborodov Alexey, Kabakci-Zorlu Duygu, Semenyuk Vladimir, Jain Amit, Davis-Dusenbery Brandi N.

Abstract

Graph-based genome reference representations have seen significant development, motivated by the inadequacy of the current human genome reference to represent the diverse genetic information from different human populations and its inability to maintain the same level of accuracy for non-European ancestries. While there have been many efforts to develop computationally efficient graph-based toolkits for NGS read alignment and variant calling, methods to curate genomic variants and subsequently construct genome graphs remain an understudied problem that inevitably determines the effectiveness of the overall bioinformatics pipeline. In this study, we discuss obstacles encountered during graph construction and propose methods for sample selection based on population diversity, graph augmentation with structural variants and resolution of graph reference ambiguity caused by information overload. Moreover, we present the case for iteratively augmenting tailored genome graphs for targeted populations and demonstrate this approach on the whole-genome samples of African ancestry. Our results show that population-specific graphs, as more representative alternatives to linear or generic graph references, can achieve significantly lower read mapping errors and enhanced variant calling sensitivity, in addition to providing the improvements of joint variant calling without the need for computationally intensive post-processing steps.

Publisher

Springer Science and Business Media LLC

Subject

General Physics and Astronomy; General Biochemistry, Genetics and Molecular Biology; General Chemistry; Multidisciplinary

Link

https://www.nature.com/articles/s41467-022-31724-3.pdf

References: 48 articles.

1. International Human Genome Sequencing Consortium et al. Initial sequencing and analysis of the human genome. Nature 409, 860–921 (2001).

2. Venter, J. C. et al. The sequence of the human genome. Science 291, 1304–1351 (2001).

3. Green, R. E. et al. A draft sequence of the Neandertal genome. Science 328, 710–722 (2010).

4. E pluribus unum. Nat. Methods 7, 331 (2010).

5. Ballouz, S., Dobin, A. & Gillis, J. A. Is it time to change the reference genome? Genome Biol. 20, 1–9 (2019).

Cited by 6 articles.

1. Unveiling Genomic Complexity: A Framework for Genome Graph Structural Analysis and Optimised Variant Calling Workflows; 2024-06-11

2. Pig pangenome graph reveals functional features of non-reference sequences; Journal of Animal Science and Biotechnology; 2024-02-22

3. Personalizing medicine in Africa: current state, progress and challenges; Frontiers in Genetics; 2023-09-19

4. Accurate human genome analysis with Element Avidity sequencing; 2023-08-14

5. Challenges of Diagnosing Mendelian Susceptibility to Mycobacterial Diseases in South Africa; International Journal of Molecular Sciences; 2023-07-28