How to train a self-driving vehicle: On the added value (or lack thereof) of curriculum learning and replay buffers

Published:2023-01-25 Volume:6
ISSN:2624-8212
Container-title:Frontiers in Artificial Intelligence
Short-container-title:Front. Artif. Intell.

Author:

Mahmoud Sara, Billing Erik, Svensson Henrik, Thill Serge

Abstract

Learning only from data collected in the real world can be unrealistic and time consuming in many scenarios. One alternative is to use synthetic data as learning environments to learn rare situations, and replay buffers to speed up the learning. In this work, we examine how the creation of the environment affects the training of a reinforcement learning agent through auto-generated environment mechanisms. We take the autonomous vehicle as an application. We compare the effect of two approaches to generating training data for artificial cognitive agents. We consider the added value of curriculum learning (just as in human learning) as a way to structure novel training data that the agent has not seen before, as well as that of using a replay buffer to train further on data the agent has seen before. In other words, the focus of this paper is on characteristics of the training data rather than on learning algorithms. We therefore use two tasks that are commonly trained early on in autonomous vehicle research: lane keeping and pedestrian avoidance. Our main results show that curriculum learning indeed offers an additional benefit over a vanilla reinforcement learning approach (using Deep-Q Learning), but the replay buffer actually has a detrimental effect in most (but not all) combinations of data generation approaches we considered here. The benefit of curriculum learning does depend on the existence of a well-defined difficulty metric with which various training scenarios can be ordered. In the lane-keeping task, we can define it as a function of the curvature of the road: the sharper and more frequent the curves, the more difficult the road. Defining such a difficulty metric in other scenarios is not always trivial. In general, the results of this paper emphasize both the importance of considering data characterization, such as curriculum learning, and the importance of defining an appropriate metric for the task.
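
The idea of ordering lane-keeping scenarios by a curvature-based difficulty metric can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the `Track` type, the `difficulty` formula (mean absolute curvature plus the fraction of samples that count as a curve), the `curve_threshold` value, and the sample data are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Track:
    """Hypothetical training scenario: a road described by sampled curvature."""
    name: str
    curvatures: Sequence[float]  # assumed: signed curvature (1/m) at evenly spaced points


def difficulty(track: Track, curve_threshold: float = 0.01) -> float:
    """Illustrative score, not the paper's metric.

    Combines how sharp the curves are (mean absolute curvature) with how
    often they occur (fraction of samples above curve_threshold).
    """
    if not track.curvatures:
        return 0.0
    abs_curv = [abs(c) for c in track.curvatures]
    sharpness = sum(abs_curv) / len(abs_curv)
    frequency = sum(c > curve_threshold for c in abs_curv) / len(abs_curv)
    return sharpness + frequency


def curriculum(tracks: List[Track]) -> List[Track]:
    """Order scenarios from easiest to hardest under the score above."""
    return sorted(tracks, key=difficulty)


# Made-up example tracks, listed in no particular order.
tracks = [
    Track("hairpins", [0.05, -0.08, 0.06, -0.07]),
    Track("straight", [0.0, 0.0, 0.0, 0.0]),
    Track("gentle", [0.0, 0.02, 0.0, 0.0]),
]
ordered = [t.name for t in curriculum(tracks)]
print(ordered)
```

An agent trained with a curriculum would see the tracks in this order, whereas a vanilla setup would sample them without regard to the score. The sketch also shows why the metric matters: a different weighting of sharpness and frequency could change the order.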

Funder

Horizon 2020

Publisher

Frontiers Media SA

Subject

Artificial Intelligence

Cited by 1 article.

1. Intelligent Identification of Moving Trajectory of Autonomous Vehicle Based on Friction Nano-Generator;IEEE Transactions on Intelligent Transportation Systems;2023