1. Discount factor as a regularizer in reinforcement learning;Amit,2020
2. Sulla determinazione empirica di una legge di distribuzione;Kolmogorov;Giorn Dell’inst Ital Degli Att,1933
3. Residual algorithms: Reinforcement learning with function approximation;Baird,1995
4. Logistic Q-learning;Bas-Serrano,2021
5. Bayramoğlu, Ö. Z., Erzin, E., Sezgin, T. M., & Yemez, Y. (2021). Engagement rewarded actor-critic with conservative Q-learning for speech-driven laughter backchannel generation. In Proceedings of the 2021 international conference on multimodal interaction (pp. 613–618).