1. Abbasi-Yadkori, Y., Pál, D., Szepesvári, C.: Improved algorithms for linear stochastic bandits. Adv. Neural Inf. Process. Syst. 24, 2312–2320 (2011)
2. Abe, N., Long, P.M.: Associative reinforcement learning using linear probabilistic concepts. In: Proceedings of the Sixteenth International Conference on Machine Learning, ICML ’99, pp. 3–11. Morgan Kaufmann Publishers Inc., San Francisco (1999)
3. Aggarwal, C.C.: Recommender Systems: The Textbook. Springer, Cham (2016)
4. Agrawal, S., Goyal, N.: Analysis of Thompson sampling for the multi-armed bandit problem. In: COLT 2012 - The 25th Annual Conference on Learning Theory, Edinburgh, Scotland, June 25–27, 2012. JMLR Proceedings, vol. 23, pp. 39.1–39.26. JMLR.org (2012)
5. Agrawal, S., Goyal, N.: Thompson sampling for contextual bandits with linear payoffs. In: Proceedings of the 30th International Conference on Machine Learning, ICML 2013, Atlanta, GA, USA, June 16–21, 2013. JMLR Workshop and Conference Proceedings, vol. 28, pp. 127–135. JMLR.org (2013)