1. J. Abounadi, D. Bertsekas, V. Borkar, ODE analysis for Q-learning algorithms, LIDS Report, MIT, Cambridge, MA, 1996
2. The theory of dynamic programming; Bellman; Bulletin of the American Mathematical Society, 1954
3. Dynamic Programming and Optimal Control; Bertsekas, 1995
4. Neuro-Dynamic Programming; Bertsekas, 1996
5. Stochastic approximation with two time scales; Borkar; Systems & Control Letters, 1997