Author:
Wei Zhuang,Chang Ching-Wen,Luo Van,Bian Beilei,Ding Xuewei
Abstract
ABSTRACTAn important issue in human population genetics is the ancestry. By extracting the ancestral information retained in the single nucleotide polymorphism (SNP) of genomic DNA, the history of migration and reproduction of the population can be reconstructed. Since the SNP data of population are multidimensional, their dimensionality reduction can demonstrate their potential internal connections. In this study, the graph and structure learning based Graph Embedding method commonly used in single cell mRNA sequencing was applied to human population genetics research to decrease the data dimension. As a result, the human population trajectory of East Asia based on 1000 Genomes Project was reconstructed to discover the inseparable relationship between the Chinese population and other East Asian populations. These results are visualized from various ancestry calculators such as E11 and K12B. Finally, the unique SNPs along the psudotime of trajectory were found by differential analysis. Bioprocess enrichment analysis was also used to reveal that the genes of these SNPs may be related to neurological diseases. These results will lay the data foundation for precision medicine.
Publisher
Cold Spring Harbor Laboratory