1. M. Mehdi Afsar, Trafford Crump, and Behrouz Far. 2021. Reinforcement learning based recommender systems: A survey. arXiv:2101.06286
2. Rishabh Agarwal, Dale Schuurmans, and Mohammad Norouzi. 2020. An Optimistic Perspective on Offline Reinforcement Learning. In ICML 2020, 13--18 July 2020, Virtual Event (Proceedings of Machine Learning Research, Vol. 119). PMLR, 104--114. http://proceedings.mlr.press/v119/agarwal20c.html
3. Gunnar Blom. 1958. Statistical Estimates and Transformed Beta Variables. Almqvist & Wiksell, John Wiley & Sons, Inc., Sweden.
4. Jacob Buckman, Carles Gelada, and Marc G. Bellemare. 2020. The Importance of Pessimism in Fixed-Dataset Policy Optimization. arXiv:2009.06799 https://arxiv.org/abs/2009.06799
5. Justin Fu, Aviral Kumar, Ofir Nachum, George Tucker, and Sergey Levine. 2020. D4RL: Datasets for Deep Data-Driven Reinforcement Learning. arXiv:2004.07219 https://arxiv.org/abs/2004.07219