Identifying efficient curricula for reinforcement learning in complex environments with a fixed computational budget

Published: 2022-01-08
Container-title: 5th Joint International Conference on Data Science & Management of Data (9th ACM IKDD CODS and 27th COMAD)

Author:

Shelke Omkar¹, Meisheri Hardik¹, Khadilkar Harshad¹

Affiliation:

1. TCS Research, IN

Publisher

ACM

Link

https://dl.acm.org/doi/pdf/10.1145/3493700.3493709

References: 30 articles.

1. Yusuf Aytar, Tobias Pfaff, David Budden, Thomas Paine, Ziyu Wang, and Nando de Freitas. 2018. Playing hard exploration games by watching YouTube. In Advances in Neural Information Processing Systems. 2930–2941.

2. Michael Bain and Claude Sammut. 1995. A Framework for Behavioural Cloning. In Machine Intelligence 15. 103–129.

3. Daniel S. Bernstein, Robert Givan, Neil Immerman, and Shlomo Zilberstein. 2002. The complexity of decentralized control of Markov decision processes. Mathematics of Operations Research 27, 4 (2002), 819–840.

4. Jack Clark and Dario Amodei. 2016. Faulty Reward Functions in the Wild. https://openai.com/blog/faulty-reward-functions/.

5. Hal Daumé, John Langford, and Daniel Marcu. 2009. Search-based structured prediction. Machine Learning 75, 3 (2009), 297–325.

Cited by 1 article.

1. A Review of the Evaluation System for Curriculum Learning. Electronics. 2023-04-01.