Reference Motion Quality and Design Choices for Bipedal Walking with PPO
-
Published:2023-12-07
Issue:
Volume:
Page:
-
ISSN:0219-8436
-
Container-title:International Journal of Humanoid Robotics
-
language:en
-
Short-container-title:Int. J. Human. Robot.
Author:
Bestmann Marc (1),
Zhang Jianwei (1)
Affiliation:
1. Department of Informatics, University of Hamburg, Vogt-Kölln-Straße 30, 22527 Hamburg, Germany
Abstract
This paper investigates the influence of reference motion quality and other design choices on the performance of deep reinforcement learning for bipedal walking with Proximal Policy Optimization (PPO). We use parametrized Cartesian quintic splines to generate reference actions for an omnidirectional walk policy. By using parameter sets of differing quality, we show that the performance of the trained policy correlates with the quality of the reference motion. We also show that a policy in Cartesian space outperforms a joint-space-based one if an advantageous representation of orientation is chosen. Additionally, we show that using an initial bias for the policy speeds up training and leads to higher performance for policies using position control. Finally, we show that we can achieve a stable omnidirectional walk on a wide variety of simulated humanoid robots.
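As an illustration of the quintic splines mentioned in the abstract, the following minimal sketch computes one quintic segment that matches position, velocity and acceleration at both ends, which is the property that makes such splines suitable for smooth Cartesian reference motions. It is not the authors' implementation: the function names, the single-axis foot motion and the numeric values are assumptions chosen for the example.

```python
def quintic_coefficients(p0, v0, a0, p1, v1, a1, T):
    """Coefficients c0..c5 of p(t) = sum(c_i * t**i) on [0, T].

    The polynomial matches position (p), velocity (v) and acceleration (a)
    at t = 0 and at t = T. All names here are hypothetical.
    """
    c0 = p0
    c1 = v0
    c2 = a0 / 2.0
    c3 = (20 * (p1 - p0) - (8 * v1 + 12 * v0) * T - (3 * a0 - a1) * T**2) / (2 * T**3)
    c4 = (30 * (p0 - p1) + (14 * v1 + 16 * v0) * T + (3 * a0 - 2 * a1) * T**2) / (2 * T**4)
    c5 = (12 * (p1 - p0) - 6 * (v1 + v0) * T - (a0 - a1) * T**2) / (2 * T**5)
    return (c0, c1, c2, c3, c4, c5)


def evaluate(coeffs, t):
    """Evaluate the polynomial at time t using Horner's scheme."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * t + c
    return result


# Assumed example: move a foot 0.1 m forward in 0.5 s, starting and
# ending at rest (zero velocity and zero acceleration at both ends).
STEP_LENGTH = 0.1   # metres, illustrative value
STEP_TIME = 0.5     # seconds, illustrative value
coeffs = quintic_coefficients(0.0, 0.0, 0.0, STEP_LENGTH, 0.0, 0.0, STEP_TIME)

for i in range(6):
    t = STEP_TIME * i / 5
    print(f"t={t:.2f} s  x={evaluate(coeffs, t):.4f} m")
```

In a parametrized walk pattern, values such as the step length and step time would come from the parameter set, so changing the parameters changes the reference trajectory without changing the spline code.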
Funder
Universität Hamburg
Publisher
World Scientific Pub Co Pte Ltd
Subject
Artificial Intelligence, Mechanical Engineering
Cited by
1 article.