1. Marc G. Bellemare, Sriram Srinivasan, Georg Ostrovski, Tom Schaul, David Saxton, and Remi Munos. 2016. Unifying count-based exploration and intrinsic motivation. arXiv preprint arXiv:1606.01868 (2016).
2. A new approach to evaluating novel recommendations
3. Olivier Chapelle and Lihong Li. 2011. An empirical evaluation of Thompson sampling. In Advances in Neural Information Processing Systems. 2249–2257.
4. Top-K Off-Policy Correction for a REINFORCE Recommender System