1. M Mehdi Afsar, Trafford Crump, and Behrouz Far. 2021. Reinforcement learning based recommender systems: A survey. arXiv preprint arXiv:2101.06286.
2. Sanjeev Arora, Rong Ge, Yingyu Liang, Tengyu Ma, and Yi Zhang. 2017. Generalization and equilibrium in generative adversarial nets (GANs). In International Conference on Machine Learning. PMLR, 224--232.
3. Jimmy Lei Ba, Jamie Ryan Kiros, and Geoffrey E Hinton. 2016. Layer normalization. arXiv preprint arXiv:1607.06450.
4. Richard Bellman. 1966. Dynamic programming. Science, Vol. 153, 3731, 34--37.
5. Jiawei Chen, Hande Dong, Xiang Wang, Fuli Feng, Meng Wang, and Xiangnan He. 2020. Bias and debias in recommender system: A survey and future directions. arXiv preprint arXiv:2010.03240.