Affiliations:
1. Technische Universität München, Germany
2. AxiomZen, Canada
3. Siemens AG, Germany
Abstract
This chapter introduces a model-based reinforcement learning (RL) approach for continuous state and action spaces. While most RL methods try to find closed-form policies, the approach taken here employs numerical online optimization of control action sequences, following the strategy of nonlinear model predictive control. First, a general method for reformulating RL problems as optimization tasks is provided. Subsequently, particle swarm optimization (PSO) is applied to search for optimal solutions. This PSO policy (PSO-P) is effective for high-dimensional state spaces and does not require a priori assumptions about adequate policy representations. Furthermore, by translating RL problems into optimization tasks, the rich collection of real-world-inspired RL benchmarks is made available for benchmarking numerical optimization techniques. The effectiveness of PSO-P is demonstrated on two standard benchmarks, mountain car and cart-pole swing-up, and on a new industry-inspired benchmark, the so-called industrial benchmark.
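To make the abstract's idea concrete, the following is a minimal sketch of a PSO-P-style planning step: PSO searches over open-loop action sequences, each evaluated by rolling out a system model, as in nonlinear model predictive control. The dynamics used here are the classic mountain-car equations (simplified: the left-wall velocity reset is omitted), and the function names, PSO coefficients, and horizon are illustrative assumptions, not details taken from the chapter.

```python
import numpy as np

def mountain_car_step(pos, vel, action):
    # Classic mountain-car dynamics; action is a force in [-1, 1].
    vel = np.clip(vel + 0.001 * action - 0.0025 * np.cos(3 * pos), -0.07, 0.07)
    pos = np.clip(pos + vel, -1.2, 0.6)
    return pos, vel

def rollout_return(state, actions):
    # Return of an open-loop action sequence under the model:
    # reward -1 per step until the goal position (0.6) is reached.
    pos, vel = state
    total = 0.0
    for a in actions:
        pos, vel = mountain_car_step(pos, vel, a)
        total -= 1.0
        if pos >= 0.6:
            break
    return total

def pso_plan(state, horizon=60, particles=20, iters=20, seed=0):
    # PSO over action sequences in [-1, 1]^horizon (one PSO-P planning step).
    rng = np.random.default_rng(seed)
    x = rng.uniform(-1, 1, (particles, horizon))   # particle positions
    v = np.zeros_like(x)                           # particle velocities
    pbest = x.copy()                               # personal bests
    pbest_f = np.array([rollout_return(state, a) for a in x])
    g = pbest[pbest_f.argmax()].copy()             # global best sequence
    g_f = pbest_f.max()
    w, c1, c2 = 0.7, 1.5, 1.5                      # common PSO coefficients (assumed)
    for _ in range(iters):
        r1, r2 = rng.random(x.shape), rng.random(x.shape)
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (g - x)
        x = np.clip(x + v, -1, 1)
        f = np.array([rollout_return(state, a) for a in x])
        improved = f > pbest_f
        pbest[improved], pbest_f[improved] = x[improved], f[improved]
        if f.max() > g_f:
            g, g_f = x[f.argmax()].copy(), f.max()
    return g, g_f
```

In the receding-horizon (MPC) style described in the abstract, only the first action of the best sequence would be applied to the real system before replanning from the newly observed state.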
Cited by 1 article:
1. Harvesting Tomorrow. Advances in Computational Intelligence and Robotics, 2024-02-23.