1. Agrawal, S., & Goyal, N. (2012). Analysis of Thompson sampling for the multi-armed bandit problem. In Conference on learning theory (pp. 1–39).
2. Auer, P., Cesa-Bianchi, N., Freund, Y., & Schapire, R. E. (1995). Gambling in a rigged casino: The adversarial multi-armed bandit problem. In Proceedings of the 36th annual symposium on foundations of computer science (pp. 322–331).
3. Bijl, H., Schön, T. B., van Wingerden, J.-W., & Verhaegen, M. (2016). A sequential Monte Carlo approach to Thompson sampling for Bayesian optimization. arXiv preprint.
4. Brochu, E., Cora, V. M., & de Freitas, N. (2010). A tutorial on Bayesian optimization of expensive cost functions, with application to active user modeling and hierarchical reinforcement learning. arXiv preprint arXiv:1012.2599.
5. Chapelle, O., & Li, L. (2011). An empirical evaluation of Thompson sampling. In Advances in neural information processing systems (pp. 2249–2257).