Self-Learning Robot Autonomous Navigation with Deep Reinforcement Learning Techniques

Authors:

Pintos Gómez de las Heras Borja (1), Martínez-Tomás Rafael (1), Cuadra Troncoso José Manuel (1)

Affiliation:

1. Department of Artificial Intelligence, National Distance Education University, Juan del Rosal 16, 28040 Madrid, Spain

Abstract

State-of-the-art solutions for autonomous driving typically rely on complex, computationally expensive algorithms when non-holonomic robots must be controlled in scenarios with spatial restrictions and dynamic obstacles while satisfying safety, comfort, and legal requirements at all times. These highly complex software solutions must cover the wide variety of use cases that can arise in traffic, especially in scenarios with dynamic obstacles. Reinforcement learning algorithms are seen as a powerful tool for autonomous driving because the complex behavior is learned automatically by trial and error, guided by simple reward functions. This paper proposes a methodology for defining simple reward functions so that a complex and successful autonomous driving policy is obtained automatically. The proposed methodology has no motion planning module, so the required computational power can be kept low, as in the reactive robotic paradigm. Reactions are learned by maximizing the cumulative reward obtained during the learning process. Because the motion is derived from the cumulative reward, the proposed algorithm is not bound to any embedded model of the robot and is not affected by the uncertainties of such models or estimators, which makes it possible to generate trajectories that take non-holonomic constraints into account. This paper explains the proposed methodology and discusses the experimental setup and the results that validate the methodology in scenarios with dynamic obstacles. The reinforcement learning algorithm is also compared with state-of-the-art approaches to show how the proposed methodology outperforms them.
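
As a rough illustration of what "simple reward functions" and "maximization of the cumulative reward" can mean in practice, the sketch below defines a per-step reward and the discounted return of one episode. It is not the reward design used in the paper: the `StepOutcome` fields, the reward constants, and the discount factor `gamma` are all hypothetical values chosen only to make the idea concrete.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class StepOutcome:
    """Hypothetical summary of one simulation step (not taken from the paper)."""
    collided: bool        # robot hit a static or dynamic obstacle
    reached_goal: bool    # robot arrived at the target
    progress_m: float     # reduction in distance to the goal this step, in metres


def simple_reward(outcome: StepOutcome) -> float:
    """Illustrative reward: large terminal terms plus a small shaping term.

    All constants are assumed values for illustration only.
    """
    if outcome.collided:
        return -100.0
    if outcome.reached_goal:
        return 100.0
    # A small per-step cost favours short trajectories; progress is rewarded.
    return -0.1 + 1.0 * outcome.progress_m


def discounted_return(rewards: Sequence[float], gamma: float = 0.99) -> float:
    """Cumulative discounted reward G = sum over t of gamma**t * r_t."""
    g = 0.0
    # Accumulate backwards so each reward is discounted by its time index.
    for r in reversed(rewards):
        g = r + gamma * g
    return g
```

A learning agent would adjust its policy so that returns computed this way grow over training; no robot model or motion planner appears anywhere in the reward itself, which is the property the abstract emphasizes.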

Publisher

MDPI AG

Subject

Fluid Flow and Transfer Processes, Computer Science Applications, Process Chemistry and Technology, General Engineering, Instrumentation, General Materials Science

