1. A. G. Barto, R. S. Sutton, and C. J. C. H. Watkins. Learning and sequential decision making. Technical Report COINS 89-95, Dept. of Computer and Information Science, University of Massachusetts, Amherst, 1989.
2. Bertsekas. Parallel and Distributed Computation: Numerical Methods. 1989.
3. Blackwell. Discrete dynamic programming. Ann. Math. Statist., 1962.
4. Hardy. Divergent Series. 1949.
5. Howard. Dynamic Programming and Markov Processes. 1960.