1. M Mehdi Afsar, Trafford Crump, and Behrouz Far. 2021. Reinforcement learning based recommender systems: A survey. arXiv preprint arXiv:2101.06286 (2021).
2. Yuri Burda, Harrison Edwards, Amos Storkey, and Oleg Klimov. 2018. Exploration by random network distillation. arXiv preprint arXiv:1810.12894 (2018).
3. Qingpeng Cai, Zhenghai Xue, Chi Zhang, Wanqi Xue, Shuchang Liu, Ruohan Zhan, Xueliang Wang, Tianyou Zuo, Wentao Xie, Dong Zheng, 2023. Two-Stage Constrained Actor-Critic for Short Video Recommendation. arXiv preprint arXiv:2302.01680 (2023).
4. Large-Scale Interactive Recommendation with Tree-Structured Policy Gradient
5. Stabilizing Reinforcement Learning in Dynamic Environment with Application to Online Recommendation