1. Y. Abbasi-Yadkori, D. Pál, and C. Szepesvári, Improved algorithms for linear stochastic bandits, In J. Shawe-Taylor, R. S. Zemel, P. L. Bartlett, F. Pereira, and K. Q. Weinberger, editors, Advances in Neural Information Processing Systems 24, 2312-2320. Curran Associates, Inc., 2011. URL https://papers.nips.cc/paper/4417-improved-algorithms-for-linear-stochastic-bandits.
2. D. Agarwal, Computational advertising: The LinkedIn way, In Proceedings of the 22nd ACM International Conference on Information & Knowledge Management, CIKM '13, 1585-1586, New York, NY, USA, 2013. ACM. ISBN 978-1-4503-2263-8.
3. S. Agrawal and N. Goyal, Analysis of Thompson sampling for the multi-armed bandit problem, In Proceedings of the 25th Annual Conference on Learning Theory, PMLR 23: 39.1-39.26, 2012. URL https://proceedings.mlr.press/v23/agrawal12.
4. S. Agrawal and N. Goyal, Further optimal regret bounds for Thompson sampling, In Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics, PMLR 31: 99-107, 2013. URL https://proceedings.mlr.press/v31/agrawal13a.html.
5. S. Agrawal and N. Goyal, Thompson sampling for contextual bandits with linear payoffs, In Proceedings of the 30th International Conference on Machine Learning, PMLR 28(3): 127-135, 2013. URL https://proceedings.mlr.press/v28/agrawal13.html.