Author:
Chan Jeffrey, Perrone Valerio, Spence Jeffrey P., Jenkins Paul A., Mathieson Sara, Song Yun S.
Abstract
An explosion of high-throughput DNA sequencing in the past decade has led to a surge of interest in population-scale inference with whole-genome data. Recent work in population genetics has centered on designing inference methods for relatively simple model classes, and few scalable general-purpose inference techniques exist for more realistic, complex models. To achieve scalable inference for such models, two inferential challenges need to be addressed: (1) population data are exchangeable, calling for methods that efficiently exploit the symmetries of the data, and (2) computing likelihoods is intractable, as it requires integrating over a set of correlated, extremely high-dimensional latent variables. These challenges are traditionally tackled by likelihood-free methods that use scientific simulators to generate datasets and reduce them to hand-designed, permutation-invariant summary statistics, often leading to inaccurate inference. In this work, we develop an exchangeable neural network that performs summary statistic-free, likelihood-free inference. Our framework can be applied in a black-box fashion across a variety of simulation-based tasks, both within and outside biology. We demonstrate the power of our approach on the recombination hotspot testing problem, where it outperforms the state of the art.
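The exchangeability requirement in challenge (1) can be illustrated with a minimal sketch. It is not the authors' architecture: the function name `exchangeable_forward`, the weight matrices, the layer sizes, and the choice of mean pooling are all assumptions made for illustration. The idea shown is only that applying one shared feature map to every individual and then pooling with a symmetric function gives an output that does not depend on the order of individuals.

```python
import numpy as np

def exchangeable_forward(x, w_phi, w_rho):
    """Hypothetical permutation-invariant forward pass.

    x     : (n_individuals, n_sites) genotype matrix
    w_phi : (n_sites, n_hidden) weights shared across all individuals
    w_rho : (n_hidden, n_outputs) weights applied after pooling
    """
    h = np.tanh(x @ w_phi)      # same feature map applied to every row
    pooled = h.mean(axis=0)     # symmetric pooling over individuals
    return pooled @ w_rho       # output of shape (n_outputs,)

# Toy data: 8 individuals, 20 sites (random values, not real genotypes).
rng = np.random.default_rng(0)
x = rng.integers(0, 2, size=(8, 20)).astype(float)
w_phi = rng.normal(size=(20, 16))
w_rho = rng.normal(size=(16, 2))

out = exchangeable_forward(x, w_phi, w_rho)
# Shuffling the order of individuals should leave the output unchanged
# (up to floating-point rounding).
out_perm = exchangeable_forward(x[rng.permutation(8)], w_phi, w_rho)
```

Comparing `out` and `out_perm` with `np.allclose` is one way to check the invariance. Other symmetric pooling functions, such as a sum or maximum over individuals, would preserve the same property.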
Publisher
Cold Spring Harbor Laboratory
Cited by
36 articles.