Linear Least-Squares algorithms for temporal difference learning

Published:1996 Issue:1-3 Volume:22 Page:33-57
ISSN:0885-6125
Container-title:Machine Learning
language:en
Short-container-title:Mach Learn

Author:

Bradtke, Steven J.; Barto, Andrew G.

Publisher

Springer Science and Business Media LLC

Subject

Artificial Intelligence, Software

Link

http://link.springer.com/content/pdf/10.1007/BF00114723.pdf

References (24 articles)

1. Anderson, C. W. (1988). Technical Report 87-509.3.

2. Barto, A. G., Sutton, R. S. & Anderson, C. W. (1983). Neuronlike elements that can solve difficult learning control problems. IEEE Transactions on Systems, Man, and Cybernetics, 13: 835–846.

3. Bradtke, S. J. (1994). Incremental Dynamic Programming for On-Line Adaptive Optimal Control. PhD thesis, University of Massachusetts, Computer Science Dept. Technical Report 94-62.

4. Darken, C., Chang, J. & Moody, J. (1992). Learning rate schedules for faster stochastic gradient search. In Neural Networks for Signal Processing 2 – Proceedings of the 1992 IEEE Workshop. IEEE Press.

5. Dayan, P. (1992). The convergence of TD(λ) for general λ. Machine Learning, 8: 341–362.

Cited by 220 articles.

1. A Functional Model Method for Nonconvex Nonsmooth Conditional Stochastic Optimization;SIAM Journal on Optimization;2024-09-10

2. Efficient Offline Reinforcement Learning With Relaxed Conservatism;IEEE Transactions on Pattern Analysis and Machine Intelligence;2024-08

3. On the Analysis of Model-Free Methods for the Linear Quadratic Regulator;Journal of the Operations Research Society of China;2024-07-09

4. A Two-Time-Scale Stochastic Optimization Framework with Applications in Control and Reinforcement Learning;SIAM Journal on Optimization;2024-03-08

5. Cooperative Finitely Excited Learning for Dynamical Games;IEEE Transactions on Cybernetics;2024-02