1. M. Mehdi Afsar, Trafford Crump, and Behrouz Far. 2022. Reinforcement Learning Based Recommender Systems: A Survey. ACM Comput. Surv. 55, 7, Article 145 (dec 2022), 38 pages. https://doi.org/10.1145/3543846
2. Shipra Agrawal and Navin Goyal. 2012. Analysis of Thompson sampling for the multi-armed bandit problem. In Conference on learning theory. JMLR Workshop and Conference Proceedings, 39–1.
3. Shipra Agrawal and Navin Goyal. 2013. Thompson sampling for contextual bandits with linear payoffs. In International conference on machine learning. PMLR, 127–135.
4. Artwork personalization at Netflix
5. Peter Auer. 2002. SIAM Journal on Computing 32, 1.