1. Natural Gradient Works Efficiently in Learning
2. Deep Reinforcement Learning: A Brief Survey
3. Tim De Bruin, Jens Kober, Karl Tuyls, and Robert Babuška. 2015. The importance of experience replay database composition in deep reinforcement learning. In Deep reinforcement learning workshop, NIPS.
4. Lih-Yuan Deng. 2006. The cross-entropy method: a unified approach to combinatorial optimization, Monte-Carlo simulation and machine learning.
5. Scott Fujimoto, Herke Hoof, and David Meger. 2018. Addressing function approximation error in actor-critic methods. In International conference on machine learning. PMLR, 1587–1596.