The effect of model uncertainties in the reinforcement learning based regulation problem: An experimental case study with inverted pendulum

Authors:

Pal, Amit Kumar¹; Oveisi, Atta¹; Nestorović, Tamara¹

Affiliation:

1. Mechanics of Adaptive Systems (MAS), Ruhr‐Universität Bochum, Bochum, Germany

Abstract

This work aims to improve the classical white‐box model of an inverted pendulum in order to reach a more accurate representation of an actual pendulum-on-a-cart system. The purpose of the model is to train different controllers based on machine learning algorithms. In the context of this paper, the inverted pendulum is driven by a belt drive that is actuated by a stepper motor. Due to the nature of the controller, the input to the stepper motor is a non‐smooth, bang‐bang‐like signal that moves the cart to the left or right, or stops it. One of the main challenges in this case is to find a suitable function to model the stepper motor, since its dynamics cannot be captured by a constant gain. It is shown that the transient behavior of the stepper motor when it changes direction or stops has a non-negligible effect on the closed‐loop control performance. Accordingly, a grey‐box scheme, which accounts for the uncertainties not included in the vanilla white‐box model, is utilized to achieve a lower model mismatch with respect to the actual pendulum. Initially, the equations of motion are derived using the Euler–Lagrange formalism with the force on the cart as the control input. In the real-time experiment, however, the interface is realized through the stepper motor's frequency modulator; hence, a transfer function representing the relationship between the commanded frequency and the force applied to the cart (in the model) is identified as a black‐box model. To improve the accuracy of this transfer function, an experimental, data‐driven design is performed based on modern schemes in system identification. For this purpose, the frequency applied to the stepper motor and the states of the actual system are recorded; the force applied to the cart is then reconstructed from the equations of motion and the recorded states. It is also shown that the uncertainty of the frequency–force transfer function due to exogenous disturbances is non‐negligible, and, with the aim of obtaining a more accurate model, an artificial neural network is introduced. Finally, the effectiveness of this grey‐box model is demonstrated by training and implementing a deep Q‐network based controller to swing up and balance the inverted pendulum.
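For reference, the Euler–Lagrange derivation mentioned in the abstract leads, for an idealized frictionless pendulum on a cart, to a pair of coupled equations. This is a sketch under assumed notation (cart mass M, point pendulum mass m, distance l from the pivot to the pendulum mass, angle \theta measured from the upright position, cart force F); the paper's own symbols and any friction terms may differ:

\begin{aligned}
(M + m)\,\ddot{x} + m l\,\ddot{\theta}\cos\theta - m l\,\dot{\theta}^{2}\sin\theta &= F,\\
m l^{2}\,\ddot{\theta} + m l\,\ddot{x}\cos\theta - m g l\,\sin\theta &= 0.
\end{aligned}

The first equation is the one that can be inverted to reconstruct F from recorded states, as in the identification sketch below.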
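The force-reconstruction and black-box identification steps described in the abstract can be sketched as follows. This is a minimal illustration, assuming recorded arrays freq (commanded stepper frequency) and x_rec, theta_rec (cart position and pole angle) sampled at a fixed rate; the parameter values and all names (reconstruct_force, fit_arx, na, nb) are hypothetical and not taken from the paper:

import numpy as np

# Assumed plant parameters and sampling time (illustrative values only).
M, m, l, g = 1.0, 0.1, 0.3, 9.81
dt = 0.01

def reconstruct_force(x, theta, dt):
    """Invert the cart equation of motion to recover the force on the
    cart from recorded states (central differences for derivatives)."""
    th_d = np.gradient(theta, dt)
    th_dd = np.gradient(th_d, dt)
    x_dd = np.gradient(np.gradient(x, dt), dt)
    return ((M + m) * x_dd
            + m * l * th_dd * np.cos(theta)
            - m * l * th_d ** 2 * np.sin(theta))

def fit_arx(u, y, na=2, nb=2):
    """Least-squares ARX fit y[k] = sum_i a_i y[k-i] + sum_j b_j u[k-j],
    i.e. a discrete transfer function from frequency u to force y."""
    n = max(na, nb)
    rows = [np.concatenate([y[k - na:k][::-1], u[k - nb:k][::-1]])
            for k in range(n, len(y))]
    coeffs, *_ = np.linalg.lstsq(np.asarray(rows), y[n:], rcond=None)
    return coeffs[:na], coeffs[na:]  # denominator and numerator parts

# Usage with recorded experiment data:
# F = reconstruct_force(x_rec, theta_rec, dt)
# a, b = fit_arx(freq, F)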
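Where the linear transfer function leaves a non-negligible residual (the disturbance-induced uncertainty noted in the abstract), a small feed-forward network mapping a window of past frequency commands to the current force can be trained on the same data. The sketch below is a generic PyTorch example, not the authors' architecture; the window length and layer sizes are assumptions:

import torch
import torch.nn as nn

class FreqToForceNet(nn.Module):
    """Maps a short window of past frequency commands to the current force."""
    def __init__(self, window=10, hidden=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(window, hidden), nn.Tanh(),
            nn.Linear(hidden, hidden), nn.Tanh(),
            nn.Linear(hidden, 1),
        )

    def forward(self, u_window):
        return self.net(u_window)

def train(model, U, F, epochs=200, lr=1e-3):
    # U: (N, window) windows of frequency input; F: (N, 1) reconstructed force.
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss_fn(model(U), F).backward()
        opt.step()
    return model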
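Finally, a compact deep Q-network in the spirit of the controller described in the abstract: three discrete actions (move left, stop, move right) that would be mapped to stepper-frequency commands, trained against the grey-box model. The environment interface (GreyBoxCartPole with reset()/step()) and all hyperparameters are placeholders rather than the authors' settings:

import copy
import random
from collections import deque
import torch
import torch.nn as nn

STATE_DIM, ACTIONS = 4, 3   # [x, x_dot, theta, theta_dot]; left/stop/right

q_net = nn.Sequential(nn.Linear(STATE_DIM, 64), nn.ReLU(),
                      nn.Linear(64, 64), nn.ReLU(),
                      nn.Linear(64, ACTIONS))
target = copy.deepcopy(q_net)
opt = torch.optim.Adam(q_net.parameters(), lr=1e-3)
buffer = deque(maxlen=50_000)
gamma = 0.99

def act(state, eps):
    # Epsilon-greedy action selection on the current Q estimates.
    if random.random() < eps:
        return random.randrange(ACTIONS)
    with torch.no_grad():
        return int(q_net(torch.tensor(state, dtype=torch.float32)).argmax())

def learn(batch_size=64):
    if len(buffer) < batch_size:
        return
    s, a, r, s2, d = zip(*random.sample(buffer, batch_size))
    s = torch.tensor(s, dtype=torch.float32)
    s2 = torch.tensor(s2, dtype=torch.float32)
    a = torch.tensor(a)
    r = torch.tensor(r, dtype=torch.float32)
    d = torch.tensor(d, dtype=torch.float32)
    q = q_net(s).gather(1, a.unsqueeze(1)).squeeze(1)
    with torch.no_grad():  # one-step TD target from the frozen network
        q_target = r + gamma * (1 - d) * target(s2).max(1).values
    loss = nn.functional.mse_loss(q, q_target)
    opt.zero_grad(); loss.backward(); opt.step()

# Training loop against the (hypothetical) grey-box simulator:
# env, eps = GreyBoxCartPole(), 1.0
# for episode in range(500):
#     s, done = env.reset(), False
#     while not done:
#         a = act(s, eps)
#         s2, r, done = env.step(a)   # reward shaped for swing-up and balance
#         buffer.append((s, a, r, s2, float(done)))
#         s = s2
#         learn()
#     eps = max(0.05, eps * 0.99)
#     target.load_state_dict(q_net.state_dict())  # periodic target sync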

Publisher

Wiley

Subject

Electrical and Electronic Engineering; Atomic and Molecular Physics, and Optics

