1. Bellman, R. (1954) The theory of dynamic programming. Bulletin of the American Mathematical Society, 60, 503-515.
2. Sutton, R.S., Precup, D. and Singh, S. (1999) Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning. Artificial Intelligence, 112, 181-211.
3. Dietterich, T.G. (2000) Hierarchical Reinforcement Learning with the MAXQ Value Function Decomposition. Journal of Artificial Intelligence Research, 13, 227-303.
4. Dayan, P. and Hinton, G.E. (1992) Feudal Reinforcement Learning. Advances in Neural Information Processing Systems, 5, 272-278.
5. Vezhnevets, A.S., Osindero, S., Schaul, T., et al. (2017) Feudal Networks for Hierarchical Reinforcement Learning. International Conference on Machine Learning, Sydney, 6 August 2017, 3540-3549.