Taking complementary advantages: Improving exploration via double self-imitation learning in procedurally-generated environments

Published: 2024-03 Volume: 238 Page: 122145
ISSN: 0957-4174
Container-title: Expert Systems with Applications
Language: en

Author:

Lin Hao, He Yue, Li Fanzhang, Liu Quan, Wang Bangjun, Zhu Fei

Funder

National Natural Science Foundation of China

Natural Science Foundation of Jiangsu Province

National Key Research and Development Program of China

Priority Academic Program Development of Jiangsu Higher Education Institutions

Publisher

Elsevier BV

Subject

Artificial Intelligence,Computer Science Applications,General Engineering

References (50 articles)

1. Abbeel, P., & Ng, A. Y. (2004). Apprenticeship learning via inverse reinforcement learning. In Proceedings of the twenty-first international conference on machine learning (pp. 1–8).

2. Aytar, Y., Pfaff, T., Budden, D., Paine, T. L., Wang, Z., & Freitas, N. d. (2018). Playing hard exploration games by watching YouTube. In Proceedings of the 32nd international conference on neural information processing systems (pp. 2935–2945).

3. Badia (2020). Never give up: Learning directed exploration strategies.

4. Badia, A. P., Sprechmann, P., Vitvitskyi, A., Guo, D., Piot, B., Kapturowski, S., et al. (2019). Never Give Up: Learning Directed Exploration Strategies. In International conference on learning representations.

5. Bellemare, M. G., Srinivasan, S., Ostrovski, G., Schaul, T., Saxton, D., & Munos, R. (2016). Unifying count-based exploration and intrinsic motivation. In Proceedings of the 30th international conference on neural information processing systems (pp. 1479–1487).

Cited by 2 articles.

1. A two-stage framework for parking search behavior prediction through adversarial inverse reinforcement learning and transformer;Expert Systems with Applications;2024-12

2. Reinforcement learning from suboptimal demonstrations based on Reward Relabeling;Expert Systems with Applications;2024-12