1. Asymptotically efficient allocation rules for the multiarmed bandit problem with multiple plays-Part I: I.I.D. rewards
2. Recommender systems survey
3. Olivier Chapelle and Lihong Li. 2011. An empirical evaluation of Thompson sampling. In Advances in Neural Information Processing Systems. 2249–2257.