1. Alekh Agarwal, Daniel Hsu, Satyen Kale, John Langford, Lihong Li, and Robert Schapire. 2014. Taming the monster: A fast and simple algorithm for contextual bandits. In Proceedings of the International Conference on Machine Learning. PMLR, 1638–1646.
2. Peter Auer, Nicolò Cesa-Bianchi, Yoav Freund, and Robert E. Schapire. 2002. The nonstochastic multiarmed bandit problem. SIAM Journal on Computing 32, 1 (2002), 48–77.
3. Mohammad Gheshlaghi Azar, Alessandro Lazaric, and Emma Brunskill. 2013. Sequential transfer in multi-armed bandit with finite set of models. In Proceedings of the 26th International Conference on Neural Information Processing Systems. 2220–2228.