Author:
De Farias D. P., Van Roy B.
Publisher
Springer Science and Business Media LLC
Subject
Applied Mathematics, Management Science and Operations Research, Control and Optimization
Reference
11 articles.
1. BELLMAN, R., and DREYFUS, S., Functional Approximations and Dynamic Programming, Mathematical Tables and Other Aids to Computation, Vol. 13, pp. 247-251, 1959.
2. SUTTON, R. S., Learning to Predict by the Method of Temporal Differences, Machine Learning, Vol. 3, pp. 9-44, 1988.
3. GURVITS, L., LIN, L. J., and HANSON, S. J., Incremental Learning of Evaluation Functions for Absorbing Markov Chains: New Methods and Theorems, Preprint, 1994.
4. PINEDA, F., Mean-Field Analysis for Batched TD(λ), Neural Computation, Vol. 9, pp. 1403-1419, 1997.
5. TSITSIKLIS, J. N., and VAN ROY, B., An Analysis of Temporal-Difference Learning with Function Approximation, IEEE Transactions on Automatic Control, Vol. 42, pp. 674-690, 1997.
Cited by
29 articles.