Abstract
Fine-scale genetic structure impacts genetic risk predictions and furthers the understanding of the demography of populations. Current approaches (e.g., PCA, DAPC, t-SNE, and UMAP) either produce coarse and ambiguous cluster divisions or fail to preserve the correct genetic distance between populations. We proposed a new machine learning algorithm named ALFDA. ALFDA considers both local and global genetic affinity between individuals and also preserves the multimodal structure within populations. ALFDA outperformed the existing approaches in identifying fine-scale genetic structure and in retaining population geogenetic distance, providing a valuable tool for geographic ancestry inference as well as correction for spatial stratification in population health studies.
Publisher
Cold Spring Harbor Laboratory