1. Barto, A.G., Bradtke, S.J. & Singh, S.P. (1991). Real-time learning and control using asynchronous dynamic programming. (COINS technical report 91–57). Amherst: University of Massachusetts.
2. Barto, A.G. & Singh, S.P. (1990). On the computational economics of reinforcement learning. In D.S. Touretzky, J. Elman, T.J. Sejnowski & G.E. Hinton, (Eds.), Proceedings of the 1990 Connectionist Models Summer School. San Mateo, CA: Morgan Kaufmann.
3. Bellman, R.E. & Dreyfus, S.E. (1962). Applied dynamic programming. RAND Corporation.
4. Chapman, D. (1991). In Proceedings of the 1991 International Joint Conference on Artificial Intelligence.
5. Kushner, H. & Clark, D. (1978). Stochastic approximation methods for constrained and unconstrained systems. Berlin, Germany: Springer-Verlag.