Efficient and generalizable tuning strategies for stochastic gradient MCMC

Published: 2023-04-04, Issue: 3, Volume: 33
ISSN: 0960-3174
Container-title: Statistics and Computing
Language: en
Short-container-title: Stat Comput

Authors:

Coullon Jeremie, South Leah, Nemeth Christopher

Abstract

Stochastic gradient Markov chain Monte Carlo (SGMCMC) is a popular class of algorithms for scalable Bayesian inference. However, these algorithms include hyperparameters such as step size or batch size that influence the accuracy of estimators based on the obtained posterior samples. As a result, these hyperparameters must be tuned by the practitioner and currently no principled and automated way to tune them exists. Standard Markov chain Monte Carlo tuning methods based on acceptance rates cannot be used for SGMCMC, thus requiring alternative tools and diagnostics. We propose a novel bandit-based algorithm that tunes the SGMCMC hyperparameters by minimizing the Stein discrepancy between the true posterior and its Monte Carlo approximation. We provide theoretical results supporting this approach and assess various Stein-based discrepancies. We support our results with experiments on both simulated and real datasets, and find that this method is practical for a wide range of applications.
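The idea in the abstract can be illustrated with a small sketch. The abstract does not say which bandit strategy, sampler, or Stein discrepancy is used, so everything below is an assumption made for illustration: successive halving as the bandit strategy, stochastic gradient Langevin dynamics (SGLD) as the sampler, a kernel Stein discrepancy (KSD) with an inverse multiquadric kernel as the score, full-data gradients when evaluating the KSD, and a toy Gaussian-mean model. All function names (`imq_ksd`, `run_sgld`, `full_score`, `tune_step_size`) are hypothetical and do not come from the paper.

```python
import numpy as np

def imq_ksd(samples, scores, c=1.0, beta=-0.5):
    """V-statistic KSD estimate with IMQ kernel k(x, y) = (c^2 + |x - y|^2)^beta."""
    n, d = samples.shape
    diff = samples[:, None, :] - samples[None, :, :]     # x_i - x_j, shape (n, n, d)
    sq = np.sum(diff**2, axis=-1)
    u = c**2 + sq
    k = u**beta
    grad_x = 2 * beta * u[..., None]**(beta - 1) * diff  # grad_y k = -grad_x k
    trace = -4 * beta * (beta - 1) * u**(beta - 2) * sq - 2 * beta * d * u**(beta - 1)
    term_x = np.einsum("ijd,jd->ij", grad_x, scores)     # grad_x k . s(x_j)
    term_y = -np.einsum("ijd,id->ij", grad_x, scores)    # grad_y k . s(x_i)
    term_ss = k * (scores @ scores.T)
    return np.sqrt(max(np.mean(trace + term_x + term_y + term_ss), 0.0))

def run_sgld(data, step, n_iter, batch, rng, prior_var=10.0):
    """SGLD for the mean of N(theta, I) data with a N(0, prior_var * I) prior."""
    N, d = data.shape
    theta = np.zeros(d)
    out = np.empty((n_iter, d))
    for t in range(n_iter):
        idx = rng.choice(N, size=batch, replace=False)
        # Unbiased minibatch estimate of the log-posterior gradient.
        grad = -theta / prior_var + (N / batch) * np.sum(data[idx] - theta, axis=0)
        theta = theta + 0.5 * step * grad + np.sqrt(step) * rng.standard_normal(d)
        out[t] = theta
    return out

def full_score(samples, data, prior_var=10.0):
    """Full-data log-posterior gradient at each sample (assumed affordable here)."""
    N = data.shape[0]
    return -samples / prior_var + data.sum(axis=0) - N * samples

def tune_step_size(data, steps, budget=200, batch=20, seed=0):
    """Successive halving: keep the half of the step sizes with the lowest KSD."""
    rng = np.random.default_rng(seed)
    arms = list(steps)
    while len(arms) > 1:
        ksd = []
        for h in arms:
            s = run_sgld(data, h, budget, batch, rng)
            ksd.append(imq_ksd(s, full_score(s, data)))
        order = np.argsort(ksd)
        arms = [arms[i] for i in order[: max(1, len(arms) // 2)]]
        budget *= 2  # survivors get a larger sampling budget in the next round
    return arms[0]

rng = np.random.default_rng(1)
data = rng.normal(loc=1.0, scale=1.0, size=(200, 2))
steps = [1e-5, 1e-4, 1e-3, 1e-2]
best = tune_step_size(data, steps)
print("selected step size:", best)
```

For brevity, this sketch reruns each surviving chain from scratch with a doubled budget; continuing the existing chains would be an equally reasonable choice. The same loop could search over batch size as well, since the KSD only needs the samples and their gradients.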

Publisher

Springer Science and Business Media LLC

Subject

Computational Theory and Mathematics; Statistics, Probability and Uncertainty; Statistics and Probability; Theoretical Computer Science

Link

https://link.springer.com/content/pdf/10.1007/s11222-023-10233-3.pdf

References: 42 articles.

1. Andrieu, C., Thoms, J.: A tutorial on adaptive MCMC. Stat. Comput. 18(4), 343–373 (2008)

2. Audibert, J.Y., Bubeck, S., Munos, R.: Best arm identification in multi-armed bandits. In: COLT, pp. 41–53 (2010)

3. Baker, J., Fearnhead, P., Fox, E.B., et al.: Control variates for stochastic gradient MCMC. Stat. Comput. 29(3), 599–615 (2019)

4. Bingham, E., Chen, J.P., Jankowiak, M., et al.: Pyro: Deep Universal Probabilistic Programming. arXiv preprint arXiv:1810.09538 (2018)

5. Brosse, N., Durmus, A., Moulines, É.: The promises and pitfalls of stochastic gradient Langevin dynamics. In: Advances in Neural Information Processing Systems, pp. 8278–8288 (2018)

Cited by 1 article.

1. Federated Edge Intelligence and Edge Caching Mechanisms. Information (2023-07-18)