Automated Hyperparameter Tuning in Reinforcement Learning for Quadrupedal Robot Locomotion

Published:2023-12-27 Issue:1 Volume:13 Page:116
ISSN:2079-9292
Container-title:Electronics
language:en
Short-container-title:Electronics

Author:

Kim MyeongSeop¹, Kim Jung-Su², Park Jae-Han¹

Affiliation:

1. Applied Robot R&D Department, Korea Institute of Industrial Technology (KITECH), Ansan 15588, Republic of Korea

2. Department of Electrical and Information Engineering, Seoul National University of Science and Technology, Seoul 01811, Republic of Korea

Abstract

In reinforcement learning, the reward function has a significant impact on the agent's performance, yet finding appropriate reward values takes considerable trial and error. Although many automated reinforcement learning methods have been proposed to find an appropriate reward function, they have not been sufficiently validated in complex environments such as quadrupedal locomotion. Reinforcement learning of a quadrupedal robot is very sensitive to the reward function, and recent outstanding research results have relied on substantial reward-shaping effort. In this paper, we propose an automated reward shaping method that tunes the scale of the dominant reward functions in reinforcement learning of a quadrupedal robot. We select several dominant reward functions, arrange their weights at fixed intervals, and calculate a gait score for each resulting agent so that the agent with the highest score can be selected. The gait score is defined to reflect stable walking of the quadrupedal robot. Because quadrupedal locomotion learning requires reward functions of different scales depending on the robot's size and shape, we evaluate the performance of the proposed method on two different robots.
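
The selection procedure described in the abstract (choose a few dominant reward terms, lay out candidate weights, score each resulting agent, keep the best) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the gait score formula, the reward term names, or the weight values, so `lin_vel_tracking`, `torque_penalty`, the candidate lists, and `toy_train_and_score` are all hypothetical placeholders chosen only so the search loop can run.

```python
from itertools import product
from typing import Callable, Dict, List, Tuple

Weights = Dict[str, float]


def make_weight_grid(candidates: Dict[str, List[float]]) -> List[Weights]:
    """Enumerate every combination of candidate scales for the chosen reward terms."""
    names = list(candidates)
    return [
        dict(zip(names, combo))
        for combo in product(*(candidates[name] for name in names))
    ]


def select_best_agent(
    grid: List[Weights],
    train_and_score: Callable[[Weights], float],
) -> Tuple[Weights, float]:
    """Score one agent per weight setting and return the highest-scoring setting."""
    scored = [(weights, train_and_score(weights)) for weights in grid]
    return max(scored, key=lambda pair: pair[1])


def toy_train_and_score(weights: Weights) -> float:
    """Hypothetical stand-in for 'train an agent, then compute its gait score'.

    A real version would run RL training in a simulator and measure walking
    stability. This toy function just peaks at an arbitrary weight setting.
    """
    target = {"lin_vel_tracking": 1.0, "torque_penalty": 0.01}
    return -sum((weights[key] - target[key]) ** 2 for key in target)


# Hypothetical reward terms and candidate scales (not taken from the paper).
candidates = {
    "lin_vel_tracking": [0.5, 1.0, 2.0],
    "torque_penalty": [0.001, 0.01, 0.1],
}
grid = make_weight_grid(candidates)
best_weights, best_score = select_best_agent(grid, toy_train_and_score)
print(best_weights, best_score)
```

In practice the scoring callable would wrap a full training run per weight setting, so the grid size directly determines the compute cost; keeping the number of tuned reward terms small, as the abstract suggests by selecting only the dominant ones, keeps that cost manageable.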

Funder

Korea Research Institute for Defense Technology Planning and Advancement

Publisher

MDPI AG

Subject

Electrical and Electronic Engineering, Computer Networks and Communications, Hardware and Architecture, Signal Processing, Control and Systems Engineering

Link

https://www.mdpi.com/2079-9292/13/1/116/pdf

Reference: 25 articles.

1. Rudin, N., Hoeller, D., Reist, P., and Hutter, M. (2022, January 14–18). Learning to walk in minutes using massively parallel deep reinforcement learning. Proceedings of the Conference on Robot Learning, Auckland, New Zealand.

2. Learning quadrupedal locomotion over challenging terrain;Lee;Sci. Robot.,2020

3. Learning agile and dynamic motor skills for legged robots;Hwangbo;Sci. Robot.,2019

4. Automated reinforcement learning (autorl): A survey and open problems;Rajan;J. Artif. Intell. Res.,2022

5. Hutter, F., Kotthoff, L., and Vanschoren, J. (2019). Automated Machine Learning: Methods, Systems, Challenges, Springer Nature.

Cited by 2 articles.

1. AutoRL-Sim: Automated Reinforcement Learning Simulator for Combinatorial Optimization Problems;Modelling;2024-08-26

2. AUV Obstacle Avoidance Framework Based on Event-Triggered Reinforcement Learning;Electronics;2024-05-23