Affiliation:
1. Department of Computer Science, Dalhousie University, 6050 University Avenue, Halifax, NS, B3H 4R2, Canada
Abstract
Algorithms that learn through environmental interaction and delayed rewards, or reinforcement learning (RL), increasingly face the challenge of scaling to dynamic, high-dimensional, and partially observable environments. Significant attention is being paid to frameworks from deep learning, which scale to high-dimensional data by decomposing the task across multiple layers of a neural network. While effective, the resulting representations are complex and computationally demanding. In this work, we propose a framework based on genetic programming which adaptively complexifies policies through interaction with the task. We make a direct comparison with several deep reinforcement learning frameworks in the challenging Atari video game environment, as well as with more traditional reinforcement learning frameworks based on a priori engineered features. Results indicate that the proposed approach matches the quality of deep learning while being at least three orders of magnitude simpler with respect to model complexity. This enables real-time operation of the champion RL agent without recourse to specialized hardware support. Moreover, the approach is capable of evolving solutions to multiple game titles simultaneously with no additional computational cost. In this case, agent behaviours for individual games, as well as single agents capable of playing all games, emerge from the same evolutionary run.
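The abstract's central mechanism, policies that start simple and grow in complexity only as task interaction demands, can be illustrated with a short, self-contained sketch. Everything below is a hypothetical stand-in chosen for brevity: the linear instruction set, the synthetic observation vector and toy reward, and all hyper-parameters. The framework evaluated in the paper is substantially richer and operates on Atari game state rather than this surrogate task.

```python
# Minimal sketch: linear genetic programming in which policies start as a
# single instruction and may complexify via a growth-biased mutation.
# All details here are illustrative assumptions, not the paper's framework.
import random

N_INPUTS, N_REGS, N_ACTIONS = 4, 4, 3
OPS = ['+', '-', '*']

def random_instruction():
    # An instruction: (dest register, op, source-is-input flag, source index).
    src_is_input = random.random() < 0.5
    return (random.randrange(N_REGS), random.choice(OPS),
            src_is_input, random.randrange(N_INPUTS if src_is_input else N_REGS))

def act(program, obs):
    # Execute the program on one observation; registers start at zero.
    regs = [0.0] * N_REGS
    for dest, op, src_is_input, src in program:
        x = obs[src] if src_is_input else regs[src]
        if op == '+': regs[dest] += x
        elif op == '-': regs[dest] -= x
        else: regs[dest] *= x
    # The action is the index of the largest of the first N_ACTIONS registers.
    return max(range(N_ACTIONS), key=lambda a: regs[a])

def episode_return(program, steps=50):
    # Toy surrogate for episodic reward: +1 whenever the policy matches a
    # hidden target action that depends nonlinearly on the observation.
    rng, total = random.Random(0), 0.0
    for _ in range(steps):
        obs = [rng.uniform(-1, 1) for _ in range(N_INPUTS)]
        target = 0 if obs[0] * obs[1] > 0 else 1
        total += 1.0 if act(program, obs) == target else 0.0
    return total

def mutate(program):
    child = list(program)
    r = random.random()
    if r < 0.4 and len(child) > 1:            # point mutation
        child[random.randrange(len(child))] = random_instruction()
    elif r < 0.8:                             # grow: adaptive complexification
        child.insert(random.randrange(len(child) + 1), random_instruction())
    elif len(child) > 1:                      # occasional shrink
        del child[random.randrange(len(child))]
    return child

# Start from one-instruction policies; let selection decide how large they get.
pop = [[random_instruction()] for _ in range(50)]
for gen in range(30):
    pop.sort(key=episode_return, reverse=True)
    pop = pop[:25] + [mutate(random.choice(pop[:25])) for _ in range(25)]

best = max(pop, key=episode_return)
print(f"best return: {episode_return(best):.0f}, program length: {len(best)}")
```

The growth-biased mutation is the point of the sketch: program length, and hence policy complexity, emerges from selection pressure rather than being fixed in advance, mirroring the adaptive complexification the abstract describes.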
Subject
Computational Mathematics
Cited by
23 articles.