Goal-Conditioned Generators of Deep Policies

Published:2023-06-26 Issue:6 Volume:37 Page:7503-7511
ISSN:2374-3468
Container-title:Proceedings of the AAAI Conference on Artificial Intelligence
language:
Short-container-title:AAAI

Author:

Faccio Francesco,Herrmann Vincent,Ramesh Aditya,Kirsch Louis,Schmidhuber Jürgen

Abstract

Goal-conditioned Reinforcement Learning (RL) aims at learning optimal policies, given goals encoded in special command inputs. Here we study goal-conditioned neural nets (NNs) that learn to generate deep NN policies in the form of context-specific weight matrices, similar to Fast Weight Programmers and other methods from the 1990s. Using context commands of the form "generate a policy that achieves a desired expected return," our NN generators combine powerful exploration of parameter space with generalization across commands to iteratively find better and better policies. A form of weight-sharing HyperNetworks and policy embeddings scales our method to generate deep NNs. Experiments show how a single learned policy generator can produce policies that achieve any return seen during training. Finally, we evaluate our algorithm on a set of continuous control tasks where it exhibits competitive performance. Our code is public.
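
The core idea in the abstract, a generator that takes a return command and outputs the weights of a policy, can be illustrated with a toy sketch. This is not the authors' implementation: the linear generator, the names PolicyGenerator and act, and the sizes OBS_DIM and ACT_DIM are all hypothetical, and the training loop, HyperNetwork weight sharing, and policy embeddings described in the abstract are omitted.

```python
import math
import random

# Hypothetical environment sizes, chosen only for illustration.
OBS_DIM, ACT_DIM = 3, 2
N_POLICY_PARAMS = OBS_DIM * ACT_DIM + ACT_DIM  # weights plus biases


class PolicyGenerator:
    """Toy generator: maps a scalar return command to flat policy parameters.

    Assumption: a linear map stands in for the generator network.
    """

    def __init__(self, seed=0):
        rng = random.Random(seed)
        # One (slope, offset) pair per generated policy parameter.
        self.slope = [rng.gauss(0.0, 0.1) for _ in range(N_POLICY_PARAMS)]
        self.offset = [rng.gauss(0.0, 0.1) for _ in range(N_POLICY_PARAMS)]

    def generate(self, desired_return):
        # Each policy parameter is an affine function of the command.
        return [s * desired_return + o
                for s, o in zip(self.slope, self.offset)]


def act(policy_params, obs):
    """Run the generated policy: action = tanh(W @ obs + b)."""
    weights = policy_params[:OBS_DIM * ACT_DIM]
    biases = policy_params[OBS_DIM * ACT_DIM:]
    actions = []
    for j in range(ACT_DIM):
        row = weights[j * OBS_DIM:(j + 1) * OBS_DIM]
        pre_activation = sum(w * x for w, x in zip(row, obs)) + biases[j]
        actions.append(math.tanh(pre_activation))
    return actions


generator = PolicyGenerator()
# Command: "generate a policy that achieves a desired expected return".
params = generator.generate(desired_return=1.5)
action = act(params, obs=[0.1, -0.2, 0.3])
```

In this sketch, changing desired_return changes every generated policy parameter, which is the sense in which one generator covers a family of policies indexed by the command. Whether a generated policy actually achieves the commanded return depends on training, which the sketch does not cover.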

Publisher

Association for the Advancement of Artificial Intelligence (AAAI)

Subject

General Medicine

Cited by 2 articles.

1. Learning One Abstract Bit at a Time Through Self-invented Experiments Encoded as Neural Networks;Active Inference;2023-11-16

2. Learning to Identify Critical States for Reinforcement Learning from Videos;2023 IEEE/CVF International Conference on Computer Vision (ICCV);2023-10-01