1. Abbeel, P., & Ng, A. Y. (2005). Learning first-order Markov models for control. Advances in neural information processing systems, 17, 1–8.
2. Asadi, K., Misra, D., Kim, S., & Littman, M. L. (2019). Combating the compounding-error problem with a multi-step model. arXiv preprint arXiv:1905.13320.
3. Asadi, K., Misra, D., & Littman, M. (2018). Lipschitz continuity in model-based reinforcement learning. International conference on machine learning (pp. 264–273).
4. Cho, Y., Kim, J., & Kim, J. (2021). Intent inference-based ship collision avoidance in encounters with rule-violating vessels. IEEE Robotics and Automation Letters, 7(1), 518–525.
5. Co-Reyes, J., Liu, Y., Gupta, A., Eysenbach, B., Abbeel, P., & Levine, S. (2018). Self-consistent trajectory autoencoder: Hierarchical reinforcement learning with trajectory embeddings. International conference on machine learning (pp. 1009–1018).