1. Achiam, J., Knight, E., Abbeel, P.: Towards Characterizing Divergence in Deep Q-Learning. arXiv:1903.08894 (2019)
2. Benureau, F.C.Y., Oudeyer, P.Y.: Behavioral diversity generation in autonomous exploration through reuse of past experience. Front. Robot. AI 3, 8 (2016)
3. Burda, Y., Edwards, H., Storkey, A., Klimov, O.: Exploration by Random Network Distillation. arXiv:1810.12894 (2018)
4. Chiang, H.T.L., Hsu, J., Fiser, M., Tapia, L., Faust, A.: RL-RRT: Kinodynamic Motion Planning via Learning Reachability Estimators from RL Policies. arXiv:1907.04799 (2019)
5. Ciosek, K., Vuong, Q., Loftin, R., Hofmann, K.: Better Exploration with Optimistic Actor-Critic. arXiv:1910.12807 (2019)