An off-policy least square algorithms with eligibility trace based on importance reweighting

Published: 2017-09-12
Volume: 20
Issue: 4
Pages: 3475-3487
ISSN: 1386-7857
Container-title: Cluster Computing
Language: en
Short-container-title: Cluster Comput

Author:

Zhang Haifei, Hong Ying, Qiu Jianlin

Funder

Universities Natural Science Research Project of Jiangsu Province

Universities Natural Science Research Project of Anhui Province

Publisher

Springer Science and Business Media LLC

Subject

Computer Networks and Communications, Software

Link

http://link.springer.com/article/10.1007/s10586-017-1165-0/fulltext.html

References: 26 articles.

1. Sutton, R.S., Barto, A.G.: Reinforcement Learning: An Introduction. MIT Press, Cambridge (1998)

2. Koller, D., Parr, R.: Policy iteration for factored MDPs. In: Proceedings of the 16th Conference on Uncertainty in Artificial Intelligence, Stanford, USA (2000)

3. Antos, A., Szepesvári, C., Munos, R.: Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path. Mach. Learn. 71(1), 89–129 (2008)

4. Tsitsiklis, J.N., Van Roy, B.: An analysis of temporal-difference learning with function approximation. IEEE Trans. Autom. Control 42, 674–690 (1997)

5. Geist, M., Pietquin, O.: Parametric value function approximation: a unified view. In: Proceedings of the IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, Piscataway, USA (2011)

Cited by 1 article.

1. Application of Partial Least Squares Method Based on Big Data Analysis Technology in Sensor Error Compensation. Lecture Notes in Electrical Engineering (2022)