Visualizing population structure with variational autoencoders

Published: 2021-01-01 Issue: 1 Volume: 11 Page: 1-11
ISSN: 2160-1836
Container-title: G3 Genes|Genomes|Genetics
Language: en
Author:

Battey C J¹, Coffing Gabrielle C¹, Kern Andrew D¹

Affiliation:

1. Department of Biology, University of Oregon Institute of Ecology and Evolution, Eugene, Oregon, 97403

Abstract

Dimensionality reduction is a common tool for visualization and inference of population structure from genotypes, but popular methods either return too many dimensions for easy plotting (PCA) or fail to preserve global geometry (t-SNE and UMAP). Here we explore the utility of variational autoencoders (VAEs), generative machine learning models in which a pair of neural networks seek to first compress and then recreate the input data, for visualizing population genetic variation. VAEs incorporate nonlinear relationships, allow users to define the dimensionality of the latent space, and in our tests preserve global geometry better than t-SNE and UMAP. Our implementation, which we call popvae, is available as a command-line Python program at github.com/kr-colab/popvae. The approach yields latent embeddings that capture subtle aspects of population structure in humans and Anopheles mosquitoes, and can generate artificial genotypes characteristic of a given sample or population.
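
The compress-then-recreate idea in the abstract can be illustrated with the generic VAE training objective: a reconstruction term plus a KL penalty that pulls the latent distribution toward a standard normal. The sketch below is not taken from popvae; the function names, the use of binary cross-entropy, and the assumption that genotypes are scaled to [0, 1] (for example, allele counts divided by ploidy) are illustrative choices only.

```python
import math
import random


def reparameterize(mu, log_var, rng):
    """Draw a latent sample z = mu + sigma * eps, with eps ~ N(0, 1).

    mu and log_var are the encoder outputs (one value per latent dimension).
    """
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """KL divergence from N(mu, sigma^2) to N(0, 1), summed over latent dims."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                      for m, lv in zip(mu, log_var))


def binary_cross_entropy(x, x_hat, eps=1e-7):
    """Reconstruction error between inputs x and decoder outputs x_hat.

    Assumes (hypothetically) that genotypes are scaled to the range [0, 1].
    """
    total = 0.0
    for xi, pi in zip(x, x_hat):
        pi = min(max(pi, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= xi * math.log(pi) + (1.0 - xi) * math.log(1.0 - pi)
    return total


def vae_loss(x, x_hat, mu, log_var):
    """Negative ELBO for one sample: reconstruction error plus KL penalty."""
    return binary_cross_entropy(x, x_hat) + kl_to_standard_normal(mu, log_var)


# Toy usage: a 2-D latent space, matching the kind of plot-ready embedding
# the abstract describes. The numbers here are made up for illustration.
rng = random.Random(0)
z = reparameterize([0.3, -1.2], [0.0, 0.0], rng)
loss = vae_loss([1.0, 0.0, 0.5], [0.9, 0.1, 0.5], [0.3, -1.2], [0.0, 0.0])
```

Setting the latent dimensionality to two is what makes the encoder's mean outputs directly plottable; in a real model the encoder and decoder would be neural networks trained by minimizing this loss over all samples.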

Funder

NIH

Publisher

Oxford University Press (OUP)

Subject

Genetics (clinical), Genetics, Molecular Biology

Link

http://academic.oup.com/g3journal/article-pdf/11/1/1/36546456/jkaa036.pdf

References: 64 articles.

1. A global reference for human genetic variation; Nature, 2015

2. A community-maintained standard library of population genetic models

3. Predicting the landscape of recombination using deep learning; Adrion; Mol Biol Evol, 2020

4. Genome variation and population structure among 1142 mosquitoes of the African malaria vector species Anopheles gambiae and Anopheles coluzzii; Genome Res, 2020

Cited by 49 articles.

1. Tracing the genealogy origin of geographic populations based on genomic variation and deep learning; Molecular Phylogenetics and Evolution, 2024-09

2. Latent generative modeling of long genetic sequences with GANs; 2024-08-07

3. Pandora: A Tool to Estimate Dimensionality Reduction Stability of Genotype Data; 2024-03-15

4. Microgeographic population structuring in a genus of California trapdoor spiders and discovery of an enigmatic new species (Euctenizidae: Promyrmekiaphila korematsui sp. nov.); Ecology and Evolution, 2024-03

5. Tree sequences as a general-purpose tool for population genetic inference; 2024-02-21