Author:
Han Shuai, Zhou Wenbo, Lü Shuai, Zhu Sheng, Gong Xiaoyu
Funder
Fundamental Research Funds for the Central Universities
Natural Science Foundation of Jilin Province
Northeast Normal University
National Natural Science Foundation of China
Jilin University
National Key Research and Development Program of China
Subject
Artificial Intelligence, Information Systems and Management, Computer Science Applications, Theoretical Computer Science, Control and Systems Engineering, Software
Cited by
3 articles.