1. BartoA.G., BradtkeS.J. & SinghS.P. (1995). Learning to act using real-time dynamic programming.Artificial Intelligence,72, 81?138.
2. BartoA.G., SuttonR.S. & WatkinsC.J.C.H. (1989). Learning and sequential decision making. In MGabriel & JMoore, editors,Learning and Computational Neuroscience: Foundations of Adaptive Networks. Cambridge, MA: MIT Press, Bradford Books.
3. BertsekasD. & ShreveS.E. (1978).Stochastic Optimal Control: The Discrete Time Case. New York, NY: Academic Press.
4. CohnD.A. (1994). Neural network exploration using optimal experiment design. In JDCowan, GTesauro & JAllspector, editors,Advances in Neural Information Processing Systems, 6. San Mateo, CA: Morgan Kaufmann, 679?686.
5. Techical Report;J.M. Cozzolino,1965