1. Yasin Abbasi-Yadkori, Dávid Pál, and Csaba Szepesvári. 2011. Improved algorithms for linear stochastic bandits. In Advances in Neural Information Processing Systems. 2312–2320.
2. Alekh Agarwal, Daniel Hsu, Satyen Kale, John Langford, Lihong Li, and Robert Schapire. 2014. Taming the monster: A fast and simple algorithm for contextual bandits. In Proceedings of the International Conference on Machine Learning. 1638–1646.
3. Alekh Agarwal, Haipeng Luo, Behnam Neyshabur, and Robert E. Schapire. 2017. Corralling a band of bandit algorithms. In Proceedings of the Annual Conference on Learning Theory. 12–38.
4. Shipra Agrawal and Navin Goyal. 2013. Thompson sampling for contextual bandits with linear payoffs. In Proceedings of the International Conference on Machine Learning. 127–135.
5. Peter Auer, Nicolò Cesa-Bianchi, Yoav Freund, and Robert E. Schapire. 2002. The nonstochastic multiarmed bandit problem. SIAM Journal on Computing 32, 1 (2002), 48–77.