1. Abbeel, P., Coates, A., Quigley, M., & Ng, A. Y. (2007). An application of reinforcement learning to aerobatic helicopter flight. In Proceedings of advances in neural information processing systems (pp. 1–8).
2. Andrychowicz, M., Wolski, F., Ray, A., Schneider, J., Fong, R., Welinder, P., et al. (2017). Hindsight experience replay. In Proceedings of advances in neural information processing systems.
3. Baranes, A., & Oudeyer, P. Y. (2013). Active learning of inverse models with intrinsically motivated goal exploration in robots. Robotics and Autonomous Systems, 61(1), 49–73.
4. Bellemare, M., Srinivasan, S., Ostrovski, G., Schaul, T., Saxton, D., & Munos, R. (2016). Unifying count-based exploration and intrinsic motivation. In Proceedings of advances in neural information processing systems (pp. 1471–1479).
5. Burda, Y., Edwards, H., Storkey, A. J., & Klimov, O. (2018). Exploration by random network distillation. arXiv:1810.12894.