1. L. Dong, Z. He, C. Song, and C. Sun, "A review of mobile robot motion planning methods: from classical motion planning workflows to reinforcement learning-based architectures," Journal of Systems Engineering and Electronics 34, 439–459 (2023).
2. M. Andrychowicz, F. Wolski, A. Ray, J. Schneider, R. Fong, P. Welinder, B. McGrew, J. Tobin, P. Abbeel, and W. Zaremba, "Hindsight experience replay," in Advances in Neural Information Processing Systems 30, edited by I. Guyon et al. (Curran Associates, Inc., Red Hook, 2017), pp. 1645–1653.
3. L. Schramm, Y. Deng, E. Granados, and A. Boularias, "USHER: unbiased sampling for hindsight experience replay," in Proceedings of the 6th Conference on Robot Learning, Proceedings of Machine Learning Research Vol. 205, edited by K. Liu et al. (PMLR, Auckland, 2022), pp. 2073–2082.
4. Y. Burda, H. Edwards, D. Pathak, A. Storkey, T. Darrell, and A. A. Efros, "Large-scale study of curiosity-driven learning," in Proceedings of the 7th International Conference on Learning Representations (ICLR, New Orleans, 2019).
5. R. S. Sutton, D. Precup, and S. Singh, "Between MDPs and semi-MDPs: a framework for temporal abstraction in reinforcement learning," Artificial Intelligence 112, 181–211 (1999).