1. Sutton R S, Barto A G. Reinforcement Learning: An Introduction. Cambridge: MIT Press, 1998
2. Watkins C J, Dayan P. Q-learning. Mach Learn, 1992, 8: 279–292
3. Rummery G A, Niranjan M. On-line Q-learning using connectionist systems. Technical Report CUED/F-INFENG/TR 166. Cambridge: University of Cambridge, Department of Engineering, 1994
4. Wiering M, Schmidhuber J. HQ-learning. Adaptive Behav, 1997, 6: 219–246
5. Chen C L, Dong D Y, Li H-X, et al. Hybrid MDP based integrated hierarchical Q-learning. Sci China Inf Sci, 2011, 54: 2279–2294