1. Achiam, J., & Sastry, S. (2017). Surprise-based intrinsic motivation for deep reinforcement learning.
2. Andrychowicz, M., et al. (2017). Hindsight experience replay.
3. Azar, M. G., Osband, I., & Munos, R. (2017). Minimax Regret Bounds for Reinforcement Learning. In International Conference on Machine Learning (pp. 263–272).
4. Bellemare, M. G., Dabney, W., & Munos, R. (2017). A Distributional Perspective on Reinforcement Learning. In International Conference on Machine Learning (pp. 449–458).
5. Bellemare, M. G., et al. (2016). Unifying count-based exploration and intrinsic motivation. In Advances in Neural Information Processing Systems.