The Option-Critic Architecture

Published: 2017-02-13 Issue: 1 Volume: 31 Page:
ISSN: 2374-3468
Container-title: Proceedings of the AAAI Conference on Artificial Intelligence
language:
Short-container-title: AAAI

Author:

Bacon Pierre-Luc, Harb Jean, Precup Doina

Abstract

Temporal abstraction is key to scaling up learning and planning in reinforcement learning. While planning with temporally extended actions is well understood, creating such abstractions autonomously from data has remained challenging. We tackle this problem in the framework of options [Sutton, Precup and Singh, 1999; Precup, 2000]. We derive policy gradient theorems for options and propose a new option-critic architecture capable of learning both the internal policies and the termination conditions of options, in tandem with the policy over options, and without the need to provide any additional rewards or subgoals. Experimental results in both discrete and continuous environments showcase the flexibility and efficiency of the framework.

Publisher

Association for the Advancement of Artificial Intelligence (AAAI)

Subject

General Medicine

Cited by 152 articles.

1. Analyzing Operator States and the Impact of AI-Enhanced Decision Support in Control Rooms: A Human-in-the-Loop Specialized Reinforcement Learning Framework for Intervention Strategies;International Journal of Human–Computer Interaction;2024-09-04

2. Auxiliary Network Enhanced Hierarchical Graph Reinforcement Learning for Vehicle Repositioning;IEEE Transactions on Intelligent Transportation Systems;2024-09

3. Predictive air combat decision model with segmented reward allocation;Complex & Intelligent Systems;2024-07-22

4. Temporally extended successor feature neural episodic control;Scientific Reports;2024-07-02

5. Hierarchical Knowledge-Enhancement Framework for multi-hop knowledge graph reasoning;Neurocomputing;2024-07