1. Afsar, M.M., Crump, T., Far, B.: Reinforcement learning based recommender systems: a survey (2021)
2. Auslander, B.: In: Lecture Notes in Computer Science (Lecture Notes in Artificial Intelligence) (2008)
3. Bengio, Y., Léonard, N., Courville, A.C.: Estimating or propagating gradients through stochastic neurons for conditional computation. CoRR abs/1308.3432 (2013). http://arxiv.org/abs/1308.3432
4. Bianchi, R.A.C.: In: Lecture Notes in Computer Science (Lecture Notes in Artificial Intelligence) (2009)
5. Blundell, C., et al.: Model-free episodic control. CoRR abs/1606.04460 (2016). http://dblp.uni-trier.de/db/journals/corr/corr1606.html#BlundellUPLRLRW16