1. Sutton, R.S., Barto, A.G.: Reinforcement learning: An introduction, vol. 1. MIT Press, Cambridge (1998)
2. Vrabie, D., Vamvoudakis, K.G., Lewis, F.L.: Optimal adaptive control and differential games by reinforcement learning principles, vol. 2. IET (2013)
3. Watkins, C.J.C.H.: Learning from delayed rewards. PhD thesis, University of Cambridge, England (1989)
4. Watkins, C.J.C.H., Dayan, P.: Q-learning. Mach. Learn. 8(3–4), 279–292 (1992)
5. Werbos, P.J.: A menu of designs for reinforcement learning over time. In: Miller, W.T., Sutton, R.S., Werbos, P.J. (eds.) Neural Networks for Control, pp. 67–95. MIT Press, Cambridge (1990)