An Advisor-Based Architecture for a Sample-Efficient Training of Autonomous Navigation Agents with Reinforcement Learning

Published:2023-09-28 Issue:5 Volume:12 Page:133
ISSN:2218-6581
Container-title:Robotics
language:en
Short-container-title:Robotics

Author:

Wijesinghe Rukshan Darshana¹², Tissera Dumindu¹², Vithanage Mihira Kasun²³, Xavier Alex²⁴, Fernando Subha²³, Samarawickrama Jayathu¹²

Affiliation:

1. Department of Electronic and Telecommunication Engineering, Faculty of Engineering, University of Moratuwa, Moratuwa 10400, Sri Lanka

2. CODEGEN QBITS LAB, University of Moratuwa, Moratuwa 10400, Sri Lanka

3. Department of Computational Mathematics, Faculty of Information Technology, University of Moratuwa, Moratuwa 10400, Sri Lanka

4. Department of Computer Science and Engineering, Faculty of Engineering, University of Moratuwa, Moratuwa 10400, Sri Lanka

Abstract

Recent advancements in artificial intelligence have enabled reinforcement learning (RL) agents to exceed human-level performance in various gaming tasks. However, despite the state-of-the-art performance demonstrated by model-free RL algorithms, they suffer from high sample complexity. Hence, it is uncommon to find their applications in robotics, autonomous navigation, and self-driving, as gathering many samples is impractical in real-world hardware systems. Therefore, developing sample-efficient learning algorithms for RL agents is crucial in deploying them in real-world tasks without sacrificing performance. This paper presents an advisor-based learning algorithm, incorporating prior knowledge into the training by modifying the deep deterministic policy gradient algorithm to reduce the sample complexity. Also, we propose an effective method of employing an advisor in data collection to train autonomous navigation agents to maneuver physical platforms, minimizing the risk of collision. We analyze the performance of our methods with the support of simulation and physical experimental setups. Experiments reveal that incorporating an advisor into the training phase significantly reduces the sample complexity without compromising the agent’s performance compared to various benchmark approaches. Also, they show that the advisor’s constant involvement in the data collection process diminishes the agent’s performance, while the limited involvement makes training more effective.

Funder

University of Moratuwa and CodeGen International (Pvt) Ltd under the Q-Bits Scholar grant

Publisher

MDPI AG

Subject

Artificial Intelligence, Control and Optimization, Mechanical Engineering

Link

https://www.mdpi.com/2218-6581/12/5/133/pdf

References: 59 articles.
