Hybrid Autoencoder with Orthogonal Latent Space for Robust Population Structure Inference

Published: 2022-06-17

Author:

Yuan Meng, Hoskens Hanne, Goovaerts Seppe, Herrick Noah, Shriver Mark D., Walsh Susan, Claes Peter

Abstract

Background: Analysis of population structure and genomic ancestry remains an important topic in human genetics and bioinformatics. Commonly used methods require high-quality genotype data to ensure accurate inference. In practice, however, laboratory artifacts and outliers are often present in the data. Moreover, existing methods are typically affected by the presence of related individuals in the dataset.

Results: In this work, we propose a novel hybrid method, called SAE-IBS, which combines the strengths of traditional matrix decomposition-based solutions (e.g., principal component analysis) and more recent neural network-based solutions (e.g., autoencoders). Specifically, it yields an orthogonal latent space, which aids dimensionality selection, while learning non-linear transformations. The proposed approach achieves higher accuracy than existing methods when projecting poor-quality target samples (with genotyping errors and missing data) onto a reference ancestry space, and it generates a robust ancestry space in the presence of relatedness.

Conclusion: We introduce a new approach and an accompanying open-source program for robust ancestry inference in the presence of missing data, genotyping errors, and relatedness. The obtained ancestry space allows for non-linear projections and exhibits orthogonality with clearly separable population groups.
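To make the idea of an orthogonal latent space concrete, the following is a minimal NumPy sketch, not the authors' SAE-IBS implementation. It assumes that orthogonality is obtained by applying a singular value decomposition to the latent codes of a non-linear encoder; the abstract does not state how SAE-IBS achieves this. The toy genotype data, the fixed random tanh layer standing in for a trained encoder, and the names `encode` and `project` are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy genotype matrix: 200 individuals x 50 SNPs, coded 0/1/2 (hypothetical data).
genotypes = rng.integers(0, 3, size=(200, 50)).astype(float)

# Stand-in for a trained encoder: one fixed random tanh layer (assumption).
W = rng.normal(scale=0.1, size=(50, 8))


def encode(x):
    """Non-linear map from genotypes to an 8-dimensional latent code."""
    return np.tanh(x @ W)


latent = encode(genotypes)
latent_mean = latent.mean(axis=0)

# SVD of the centered latent codes; the rows of Vt are orthonormal axes.
U, S, Vt = np.linalg.svd(latent - latent_mean, full_matrices=False)


def project(x):
    """Project samples onto the orthogonal axes derived from the reference set."""
    return (encode(x) - latent_mean) @ Vt.T


scores = project(genotypes)

# The columns of `scores` are mutually orthogonal, so their Gram matrix is diagonal.
gram = scores.T @ scores
off_diagonal = gram - np.diag(np.diag(gram))
print("largest off-diagonal entry:", np.abs(off_diagonal).max())
```

Because the axes are ordered by singular value, the leading columns of `scores` carry the most variation, which is what makes choosing the number of dimensions easier than in an unconstrained autoencoder latent space. New target samples can be passed through the same `project` function to place them in the reference space.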

Publisher

Cold Spring Harbor Laboratory

Cited by 1 article.

1. Exploring regional aspects of 3D facial variation within European individuals; Scientific Reports; 2023-03-06