1. Barto, A. G., Bradtke, S. J. & Singh, S. P. (1993). Learning to Act using Real-Time Dynamic Programming. Department of Computer Science, Univ. of Massachusetts, Amherst, MA 01003. Submitted to AI journal, special issue on Computational Theories of Interaction and Agency.
2. Barto, A. G., Sutton, R. S. & Anderson, C. W. (1983). Neuronlike adaptive elements that can solve difficult learning control problems. IEEE Transactions on Systems, Man, and Cybernetics.
3. Bertsekas, D. P. & Rhodes, I. B. (1971). On the Minimax Feedback Control of Uncertain Systems. Proceedings of the IEEE Decision and Control Conference, Miami, Dec 1971, pp. 451–455.
4. Bradley (1976). Probability; Decision; Statistics.
5. Heger, M. & Berns, K. (1992). Risikoloses Reinforcement-Lernen. KI, Künstliche Intelligenz: Forschung, Entwicklung, Verfahren, no. 4, pp. 26–32, Organ des Fachbereichs 1 Künstliche Intelligenz der Gesellschaft für Informatik e.V. (GI).