Solving Deep Memory POMDPs with Recurrent Policy Gradients

Published: 2007 Pages: 697-706
ISSN: 0302-9743
Container title: Lecture Notes in Computer Science

Author:

Wierstra Daan, Foerster Alexander, Peters Jan, Schmidhuber Jürgen

Publisher

Springer Berlin Heidelberg

Link

http://link.springer.com/content/pdf/10.1007/978-3-540-74690-4_71

References: 22 articles.

1. Benbrahim, H., Franklin, J.: Biped dynamic walking using reinforcement learning. Robotics and Autonomous Systems Journal (1997)

2. Moody, J., Saffell, M.: Learning to Trade via Direct Reinforcement. IEEE Transactions on Neural Networks 12(4), 875–889 (2001)

3. Prokhorov, D.: Toward effective combination of off-line and on-line training in ADP framework. In: ADPRL. Proceedings of the IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning, IEEE Computer Society Press, Los Alamitos (2007)

4. Baxter, J., Bartlett, P., Weaver, L.: Experiments with infinite-horizon, policy-gradient estimation. Journal of Artificial Intelligence Research 15, 351–381 (2001)

5. Peters, J., Schaal, S.: Policy gradient methods for robotics. In: IROS. Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, Beijing, China, pp. 2219–2225 (2006)

Cited by 32 articles.

1. Research on heterogeneous multi-UAV collaborative decision-making method based on improved PPO;Applied Intelligence;2024-07-29

2. Enhancing Student Engagement in Online Learning Through Strategy Gradient Reinforcement Learning;Learning and Analytics in Intelligent Systems;2024

3. Asymmetric Actor-Critic with Approximate Information State;2023 62nd IEEE Conference on Decision and Control (CDC);2023-12-13

4. Deep Attention Q-Network for Personalized Treatment Recommendation;2023 IEEE International Conference on Data Mining Workshops (ICDMW);2023-12-04

5. Olfactory search with finite-state controllers;Proceedings of the National Academy of Sciences;2023-08-14