1. C.W. Anderson, Learning and problem solving with multilayer connectionist systems, PhD thesis, Computer and Information Science, University of Massachusetts, 1986
2. C.W. Anderson, Learning to control an inverted pendulum using neural networks, IEEE Control Systems Magazine, 1989
3. C.W. Anderson, Q-learning with hidden-unit restarting, in: Advances in Neural Information Processing Systems, 1993, pp. 81–88
4. L.C. Baird, Residual algorithms: Reinforcement learning with function approximation, in: International Conference on Machine Learning, 1995, pp. 30–37
5. Barto, Monte Carlo matrix inversion and reinforcement learning, 1994