1. Sanjeevan Ahilan and Peter Dayan. Feudal multi-agent hierarchies for cooperative reinforcement learning. arXiv preprint arXiv:1901.08492, 2019.
2. Safa Alver. The option-critic architecture. https://alversafa.github.io/blog/2018/11/28/optncrtc.html, 2018.
3. Marcin Andrychowicz, Filip Wolski, Alex Ray, Jonas Schneider, Rachel Fong, Peter Welinder, Bob McGrew, Josh Tobin, Pieter Abbeel, and Wojciech Zaremba. Hindsight experience replay. In Advances in Neural Information Processing Systems, pages 5048–5058, 2017.
4. Arthur Aubret, Laetitia Matignon, and Salima Hassas. A survey on intrinsic motivation in reinforcement learning. arXiv preprint arXiv:1908.06976, 2019.
5. Christer Bäckström and Peter Jonsson. Planning with abstraction hierarchies can be exponentially less efficient. In Proceedings of the 14th International Joint Conference on Artificial Intelligence, volume 2, pages 1599–1604, 1995.