1. Anderson, C., & Miller, W. (1990). Challenging control problems. In Neural networks for control (pp. 475–510).
2. Anderson, C. W., Hittle, D., Katz, A., & Kretchmar, R. M. (1997). Synthesis of reinforcement learning, neural networks, and PI control applied to a simulated heating coil. Journal of Artificial Intelligence in Engineering, 11(4), 423–431.
3. Bellman, R. (1957). Dynamic programming. Princeton: Princeton University Press.
4. Boyan, J., & Littman, M. (1994). Packet routing in dynamically changing networks: A reinforcement learning approach. In J. Cowan, G. Tesauro, & J. Alspector (Eds.), Advances in neural information processing systems 6.
5. Crites, R. H., & Barto, A. G. (1996). Improving elevator performance using reinforcement learning. In Advances in neural information processing systems 8.