Neural ADMIXTURE for rapid genomic clustering

Published:2023-07-06 Issue:7 Volume:3 Page:621-629
ISSN:2662-8457
Container-title:Nature Computational Science
language:en
Short-container-title:Nat Comput Sci

Author:

Dominguez Mantes Albert, Mas Montserrat Daniel, Bustamante Carlos D., Giró-i-Nieto Xavier, Ioannidis Alexander G.

Abstract

Characterizing the genetic structure of large cohorts has become increasingly important as genetic studies extend to massive, increasingly diverse biobanks. Popular methods decompose individual genomes into fractional cluster assignments with each cluster representing a vector of DNA variant frequencies. However, with rapidly increasing biobank sizes, these methods have become computationally intractable. Here we present Neural ADMIXTURE, a neural network autoencoder that follows the same modeling assumptions as the current standard algorithm, ADMIXTURE, while reducing the compute time by orders of magnitude, surpassing even the fastest alternatives. One month of continuous compute using ADMIXTURE can be reduced to just hours with Neural ADMIXTURE. A multi-head approach allows Neural ADMIXTURE to offer even further acceleration by computing multiple cluster numbers in a single run. Furthermore, the models can be stored, allowing cluster assignment to be performed on new data in linear time without needing to share the training samples.
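
The decomposition described in the abstract (fractional cluster assignments per individual, with each cluster a vector of variant frequencies) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `admixture_reconstruction`, the softmax/sigmoid parameterization, the random inputs, and the loop over cluster numbers standing in for the "multi-head" idea are all assumptions made for illustration only.

```python
import numpy as np

def softmax(z, axis=-1):
    # Numerically stable softmax: each row becomes a point on the simplex.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def sigmoid(z):
    # Maps unconstrained values to frequencies in (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

def admixture_reconstruction(assignment_logits, frequency_logits):
    """Hypothetical helper: build Q, F and the reconstruction Q @ F.

    Q has shape (n_samples, K): fractional cluster assignments, rows sum to 1.
    F has shape (K, n_variants): per-cluster variant frequencies in (0, 1).
    """
    Q = softmax(assignment_logits, axis=1)
    F = sigmoid(frequency_logits)
    return Q, F, Q @ F

# Toy sizes and random logits (assumed, not real genotype data).
rng = np.random.default_rng(0)
n_samples, n_variants = 5, 8

# One (Q, F) pair per cluster number K, loosely mirroring the idea of
# computing multiple cluster numbers in a single run.
heads = {}
for K in (2, 3, 4):
    heads[K] = admixture_reconstruction(
        rng.normal(size=(n_samples, K)),
        rng.normal(size=(K, n_variants)),
    )
```

Because each row of Q is a convex combination weight vector and every entry of F lies in (0, 1), each reconstructed entry of Q @ F is itself a valid frequency, which is the modeling constraint such methods rely on.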

Funder

U.S. Department of Health & Human Services | NIH | National Human Genome Research Institute

Stanford Institute for Human-Centered Artificial Intelligence

Blond McIndoe Research Foundation

Publisher

Springer Science and Business Media LLC

Subject

Computer Networks and Communications, Computer Science Applications, Computer Science (miscellaneous)

Link

https://www.nature.com/articles/s43588-023-00482-7.pdf

Reference: 45 articles.

1. Martin, A. R. et al. Clinical use of current polygenic risk scores may exacerbate health disparities. Nat. Genet. 51, 584–591 (2019).