Curiosity Creates Diversity in Policy Search

Published: 2023-09-20 Issue: 3 Volume: 3 Page: 1-20
ISSN: 2688-299X
Container-title: ACM Transactions on Evolutionary Learning and Optimization
Language: en
Short-container-title: ACM Trans. Evol. Learn. Optim.

Author:

Le Tolguenec Paul-Antoine¹, Rachelson Emmanuel², Besse Yann³, Wilson Dennis G.²

Affiliation:

1. ISAE-Supaero, Université de Toulouse, Airbus, France

2. ISAE-Supaero, Université de Toulouse, France

3. Airbus, France

Abstract

When searching for policies, reward-sparse environments often lack sufficient information about which behaviors to improve upon or avoid. In such environments, the policy search process is bound to blindly search for reward-yielding transitions and no early reward can bias this search in one direction or another. A way to overcome this is to use intrinsic motivation in order to explore new transitions until a reward is found. In this work, we use a recently proposed definition of intrinsic motivation, Curiosity, in an evolutionary policy search method. We propose Curiosity-ES, an evolutionary strategy adapted to use Curiosity as a fitness metric. We compare Curiosity-ES with other evolutionary algorithms intended for exploration, as well as with Curiosity-based reinforcement learning, and find that Curiosity-ES can generate higher diversity without the need for an explicit diversity criterion and leads to more policies which find reward.
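
The idea summarized in the abstract, an evolution strategy whose fitness is a curiosity signal rather than an extrinsic reward, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: curiosity is taken to be the squared prediction error of a learned forward model (one common definition, which may differ from the one the authors use), the environment is a hypothetical 1-D point, the policy is a clipped linear map, and the names `rollout`, `ForwardModel`, and `curiosity_es` are invented here.

```python
import random


def rollout(theta, steps=20):
    """Run a hypothetical 1-D point environment with a clipped linear policy."""
    s, transitions = 0.0, []
    for _ in range(steps):
        a = max(-1.0, min(1.0, theta[0] * s + theta[1]))
        s_next = s + 0.1 * a
        transitions.append((s, a, s_next))
        s = s_next
    return transitions


class ForwardModel:
    """Linear predictor of the next state from (state, action), trained by SGD."""

    def __init__(self, lr=0.05):
        self.w = [0.0, 0.0, 0.0]
        self.lr = lr

    def predict(self, s, a):
        return self.w[0] * s + self.w[1] * a + self.w[2]

    def error(self, s, a, s_next):
        # Assumed curiosity signal: squared forward-prediction error.
        return (self.predict(s, a) - s_next) ** 2

    def update(self, s, a, s_next):
        g = 2.0 * (self.predict(s, a) - s_next)
        for i, x in enumerate((s, a, 1.0)):
            self.w[i] -= self.lr * g * x


def curiosity_es(generations=10, pop_size=16, elite=4, sigma=0.5, seed=0):
    """Simple truncation-selection ES that ranks policies by curiosity only."""
    rng = random.Random(seed)
    model = ForwardModel()
    mean = [0.0, 0.0]
    history = []
    for _ in range(generations):
        # Sample policy parameters around the current mean.
        population = [[m + sigma * rng.gauss(0.0, 1.0) for m in mean]
                      for _ in range(pop_size)]
        rollouts = [rollout(theta) for theta in population]
        # Fitness is the summed prediction error along each trajectory.
        fitness = [sum(model.error(*t) for t in traj) for traj in rollouts]
        ranked = sorted(range(pop_size), key=lambda i: fitness[i], reverse=True)
        elites = [population[i] for i in ranked[:elite]]
        mean = [sum(e[d] for e in elites) / elite for d in range(len(mean))]
        # Train the forward model on all transitions seen this generation,
        # so familiar transitions become less rewarding over time.
        for traj in rollouts:
            for t in traj:
                model.update(*t)
        history.append(max(fitness))
    return mean, history
```

Calling `curiosity_es()` returns the final mean parameter vector and the best curiosity score per generation. Because the forward model keeps learning, transitions that were novel in early generations score lower later, which is the mechanism that would push the search toward unvisited behavior in this toy setting.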

Publisher

Association for Computing Machinery (ACM)

Subject

Process Chemistry and Technology, Economic Geology, Fuel Technology

Link

https://dl.acm.org/doi/pdf/10.1145/3605782

References: 45 articles.

1. Ferran Alet, Martin F. Schneider, Tomás Lozano-Pérez, and Leslie Pack Kaelbling. 2020. Meta-learning curiosity algorithms.

2. Hindsight experience replay; Andrychowicz Marcin; Advances in Neural Information Processing Systems, 2017

3. A survey on intrinsic motivation in reinforcement learning; Aubret Arthur; arXiv preprint arXiv:1908.06976, 2019

4. Adrià Puigdomènech Badia, Pablo Sprechmann, Alex Vitvitskyi, Daniel Guo, Bilal Piot, Steven Kapturowski, Olivier Tieleman, Martin Arjovsky, Alexander Pritzel, Andrew Bolt, and Charles Blundell. 2020. Never give up: Learning directed exploration strategies. In International Conference on Learning Representations. https://openreview.net/forum?id=Sye57xStvB.

5. R-max-a general polynomial time algorithm for near-optimal reinforcement learning; Brafman Ronen I.; Journal of Machine Learning Research, 2002

Cited by 2 articles.

1. Exploration-Driven Reinforcement Learning for Avionic System Fault Detection (Experience Paper); Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis; 2024-09-11

2. Summary of "Curiosity creates Diversity in Policy Search"; Proceedings of the Genetic and Evolutionary Computation Conference Companion; 2024-07-14