The effect of model uncertainties in the reinforcement learning based regulation problem: An experimental case study with inverted pendulum

Authors:

Pal, Amit Kumar¹; Oveisi, Atta¹; Nestorović, Tamara¹

Affiliation:

1. Mechanics of Adaptive Systems (MAS), Ruhr‐Universität Bochum, Bochum, Germany

Abstract

This work aims to improve the classical white‐box model of an inverted pendulum in order to reach a more accurate representation of an actual pendulum-on-a-cart system. The purpose of the model is to train different controllers based on machine learning algorithms. In the context of this paper, the inverted pendulum is driven by a belt drive that is actuated by a stepper motor. Due to the nature of the controller, the input to the stepper motor is a non‐smooth, bang‐bang‐like signal that moves the cart to the left or right, or stops it. One of the main challenges in this case is to find a suitable function to model the stepper motor, since its dynamics cannot be captured by a constant gain. It is shown that the transient behavior of the stepper motor when it changes direction or stops has a non-negligible effect on the closed‐loop control performance. Accordingly, a grey‐box scheme, which accounts for the uncertainties not included in the vanilla white‐box model, is utilized to achieve a lower model mismatch with respect to the actual pendulum. Initially, the equations of motion are derived using the Euler–Lagrange formalism with the force on the cart as the control input. In the real-time experiment, however, the interface is realized through the stepper motor's frequency modulator; hence, a transfer function representing the relationship between the commanded frequency and the force applied to the cart (in the model) is identified as a black‐box model. To improve the accuracy of this transfer function, an experimental, data‐driven design is performed based on modern schemes in system identification. For this purpose, the frequency applied to the stepper motor and the states of the actual system are recorded; the force applied to the cart is then reconstructed from the equations of motion and the recorded states. It is also shown that the uncertainty of the frequency–force transfer function due to exogenous disturbances is non‐negligible, and, with the aim of obtaining a more accurate model, an artificial neural network is introduced. Finally, the effectiveness of this grey‐box model is demonstrated by training and implementing a deep Q‐network based controller to swing up and balance the inverted pendulum.
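For reference, the Euler–Lagrange derivation mentioned in the abstract leads, for an idealized frictionless pendulum on a cart, to a pair of coupled equations. This is a sketch under assumed notation (cart mass M, point pendulum mass m, distance l from the pivot to the pendulum mass, angle \theta measured from the upright position, cart force F); the paper's own symbols and any friction terms may differ:

\begin{aligned}
(M + m)\,\ddot{x} + m l\,\ddot{\theta}\cos\theta - m l\,\dot{\theta}^{2}\sin\theta &= F,\\
m l^{2}\,\ddot{\theta} + m l\,\ddot{x}\cos\theta - m g l\,\sin\theta &= 0.
\end{aligned}

The first equation is the one that can be inverted to reconstruct F from recorded states, as in the identification sketch below.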
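The force-reconstruction and black-box identification steps described in the abstract can be sketched as follows. This is a minimal illustration, assuming recorded arrays freq (commanded stepper frequency) and x_rec, theta_rec (cart position and pole angle) sampled at a fixed rate; the parameter values and all names (reconstruct_force, fit_arx, na, nb) are hypothetical and not taken from the paper:

import numpy as np

# Assumed plant parameters and sampling time (illustrative values only).
M, m, l, g = 1.0, 0.1, 0.3, 9.81
dt = 0.01

def reconstruct_force(x, theta, dt):
    """Invert the cart equation of motion to recover the force on the
    cart from recorded states (central differences for derivatives)."""
    th_d = np.gradient(theta, dt)
    th_dd = np.gradient(th_d, dt)
    x_dd = np.gradient(np.gradient(x, dt), dt)
    return ((M + m) * x_dd
            + m * l * th_dd * np.cos(theta)
            - m * l * th_d ** 2 * np.sin(theta))

def fit_arx(u, y, na=2, nb=2):
    """Least-squares ARX fit y[k] = sum_i a_i y[k-i] + sum_j b_j u[k-j],
    i.e. a discrete transfer function from frequency u to force y."""
    n = max(na, nb)
    rows = [np.concatenate([y[k - na:k][::-1], u[k - nb:k][::-1]])
            for k in range(n, len(y))]
    coeffs, *_ = np.linalg.lstsq(np.asarray(rows), y[n:], rcond=None)
    return coeffs[:na], coeffs[na:]  # denominator and numerator parts

# Usage with recorded experiment data:
# F = reconstruct_force(x_rec, theta_rec, dt)
# a, b = fit_arx(freq, F)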
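Where the linear transfer function leaves a non-negligible residual (the disturbance-induced uncertainty noted in the abstract), a small feed-forward network mapping a window of past frequency commands to the current force can be trained on the same data. The sketch below is a generic PyTorch example, not the authors' architecture; the window length and layer sizes are assumptions:

import torch
import torch.nn as nn

class FreqToForceNet(nn.Module):
    """Maps a short window of past frequency commands to the current force."""
    def __init__(self, window=10, hidden=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(window, hidden), nn.Tanh(),
            nn.Linear(hidden, hidden), nn.Tanh(),
            nn.Linear(hidden, 1),
        )

    def forward(self, u_window):
        return self.net(u_window)

def train(model, U, F, epochs=200, lr=1e-3):
    # U: (N, window) windows of frequency input; F: (N, 1) reconstructed force.
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss_fn(model(U), F).backward()
        opt.step()
    return model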
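Finally, a compact deep Q-network in the spirit of the controller described in the abstract: three discrete actions (move left, stop, move right) that would be mapped to stepper-frequency commands, trained against the grey-box model. The environment interface (GreyBoxCartPole with reset()/step()) and all hyperparameters are placeholders rather than the authors' settings:

import copy
import random
from collections import deque
import torch
import torch.nn as nn

STATE_DIM, ACTIONS = 4, 3   # [x, x_dot, theta, theta_dot]; left/stop/right

q_net = nn.Sequential(nn.Linear(STATE_DIM, 64), nn.ReLU(),
                      nn.Linear(64, 64), nn.ReLU(),
                      nn.Linear(64, ACTIONS))
target = copy.deepcopy(q_net)
opt = torch.optim.Adam(q_net.parameters(), lr=1e-3)
buffer = deque(maxlen=50_000)
gamma = 0.99

def act(state, eps):
    # Epsilon-greedy action selection on the current Q estimates.
    if random.random() < eps:
        return random.randrange(ACTIONS)
    with torch.no_grad():
        return int(q_net(torch.tensor(state, dtype=torch.float32)).argmax())

def learn(batch_size=64):
    if len(buffer) < batch_size:
        return
    s, a, r, s2, d = zip(*random.sample(buffer, batch_size))
    s = torch.tensor(s, dtype=torch.float32)
    s2 = torch.tensor(s2, dtype=torch.float32)
    a = torch.tensor(a)
    r = torch.tensor(r, dtype=torch.float32)
    d = torch.tensor(d, dtype=torch.float32)
    q = q_net(s).gather(1, a.unsqueeze(1)).squeeze(1)
    with torch.no_grad():  # one-step TD target from the frozen network
        q_target = r + gamma * (1 - d) * target(s2).max(1).values
    loss = nn.functional.mse_loss(q, q_target)
    opt.zero_grad(); loss.backward(); opt.step()

# Training loop against the (hypothetical) grey-box simulator:
# env, eps = GreyBoxCartPole(), 1.0
# for episode in range(500):
#     s, done = env.reset(), False
#     while not done:
#         a = act(s, eps)
#         s2, r, done = env.step(a)   # reward shaped for swing-up and balance
#         buffer.append((s, a, r, s2, float(done)))
#         s = s2
#         learn()
#     eps = max(0.05, eps * 0.99)
#     target.load_state_dict(q_net.state_dict())  # periodic target sync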

Publisher

Wiley

Subject

Electrical and Electronic Engineering; Atomic and Molecular Physics, and Optics

