
Improving Model-Based Deep Reinforcement Learning with Learning Degree Networks and Its Application in Robot Control

Published:2022-03-04 Issue: Volume:2022 Page:1-14
ISSN:1687-9619
Container-title:Journal of Robotics
language:en
Short-container-title:Journal of Robotics

Author:

Ma Guoqing¹, Wang Zhifu², Yuan Xianfeng¹, Zhou Fengyu²

Affiliation:

1. School of Mechanical, Electrical & Information Engineering, Shandong University, Weihai 264209, China

2. Control Science and Engineering, Shandong University, Jinan 250061, China

Abstract

Deep reinforcement learning applies artificial neural networks to decision-making and control. Traditional model-free reinforcement learning algorithms require a large amount of environment interaction data to train, and their performance suffers from low utilization of that data. Model-based reinforcement learning (MBRL) algorithms improve data efficiency but are limited by the low prediction accuracy of the learned model. Although MBRL can use the additional data generated by the dynamics model, a system dynamics model with low prediction accuracy provides low-quality data and degrades the algorithm's final result. In this paper, based on the A3C (Asynchronous Advantage Actor-Critic) algorithm, an improved model-based deep reinforcement learning algorithm using a learning degree network (MBRL-LDN) is presented. The learning degree of the system dynamics model is calculated by comparing the predicted states output by the proposed multidynamic model with the original predicted states. The learning degree represents the quality of the data generated by the dynamics model and is used to decide whether to continue interacting with the dynamics model during a particular episode, so that low-quality data are discarded. The superiority of the proposed method is verified through extensive comparative experiments.
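
The gating idea described in the abstract can be illustrated with a minimal sketch. The abstract does not give the learning degree formula, so everything below is an assumption made for illustration only: the function names `learning_degree` and `keep_model_rollout`, the use of mean Euclidean distance between predictions, the exponential mapping to a score in (0, 1], and the `scale` and `threshold` parameters are all hypothetical and are not taken from the paper.

```python
import math


def learning_degree(reference_state, ensemble_states, scale=1.0):
    """Hypothetical agreement score in (0, 1].

    reference_state: state predicted by the original dynamics model.
    ensemble_states: states predicted by the additional dynamics models.
    The score is 1.0 when every prediction matches the reference and
    decays toward 0 as the predictions diverge (assumed mapping).
    """
    distances = [math.dist(reference_state, s) for s in ensemble_states]
    mean_distance = sum(distances) / len(distances)
    return math.exp(-mean_distance / scale)


def keep_model_rollout(degree, threshold=0.8):
    """Assumed rule: keep interacting with the model only above a threshold."""
    return degree >= threshold


# Predictions that agree closely give a high score, so the rollout continues.
close = learning_degree([1.0, 2.0], [[1.0, 2.0], [1.0, 2.1]])
# Predictions that disagree give a low score, so the generated data is discarded.
far = learning_degree([0.0, 0.0], [[3.0, 4.0]])

print(keep_model_rollout(close), keep_model_rollout(far))
```

In this sketch the threshold plays the role of the decision described in the abstract: when the score falls below it, the episode stops drawing data from the dynamics model. How the paper's learning degree network actually computes and applies this value may differ.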

Funder

National Natural Science Foundation of China

Publisher

Hindawi Limited

Subject

General Computer Science, Control and Systems Engineering

Link

http://downloads.hindawi.com/journals/jr/2022/7169594.pdf

Reference: 27 articles.

1. Deep Deterministic Policy Gradient (DDPG)-Based Energy Harvesting Wireless Communications

2. Mastering the game of Go with deep neural networks and tree search

3. Model-Free Reinforcement Learning of Minimal-Cost Variance Control

4. Deterministic diagnostic pattern generation (DDPG) for compound defects;F. Wang

5. A theoretical analysis of deep Q-learning;J. Fan;Proceedings of the Learning for Dynamics and Control,2020

Cited by 2 articles.

1. Simulated Autonomous Driving Using Reinforcement Learning: A Comparative Study on Unity’s ML-Agents Framework;Information;2023-05-14

2. Deep Neural Networks and Smooth Approximation of PDEs;Computational Science – ICCS 2022;2022