Authors:
Hu Guanghua, Qiu Yuqin, Xiang Liming
Publisher:
Springer Berlin Heidelberg
References: 12 articles.
1. Sutton, R.S.: Learning to Predict by the Methods of Temporal Differences. Machine Learning 3, 9–44 (1988)
2. Watkins, C.J.C.H., Dayan, P.: Q-Learning. Machine Learning 8, 279–292 (1992)
3. Santharam, G., Sastry, P.S.: A Reinforcement Learning Neural Network for Adaptive Control of Markov Chains. IEEE Transactions on Systems, Man, and Cybernetics-Part A 27, 588–600 (1997)
4. Tsitsiklis, J.N., Van Roy, B.: An Analysis of Temporal-Difference Learning with Function Approximation. IEEE Transactions on Automatic Control 42, 674–690 (1997)
5. Tsitsiklis, J.N., Van Roy, B.: Feature-Based Methods for Large Scale Dynamic Programming. Machine Learning 22, 59–94 (1996)