1. [1] R. S. Sutton and A. G. Barto: Reinforcement Learning, MIT Press (1998)
2. [2] D. P. Bertsekas and J. N. Tsitsiklis: Neuro-Dynamic Programming, Athena Scientific (1996)
3. [3] J. Kober, D. Bagnell and J. Peters: Reinforcement learning in robotics: A survey; Int. J. Robotics Research, Vol. 32, No. 11, pp. 1238-1274 (2013)
4. [4] R. S. Sutton, et al.: Policy gradient methods for reinforcement learning with function approximation; Proc. NIPS 12, pp. 1057-1063 (1999)
5. [6] M. Strens: A Bayesian framework for reinforcement learning; Proc. ICML, pp. 943-950 (2000)