Model-Based and Model-Free Replay Mechanisms for Reinforcement Learning in Neurorobotics

Published: 2022-06-24 Volume: 16
ISSN: 1662-5218
Container-title: Frontiers in Neurorobotics
Short-container-title: Front. Neurorobot.

Author:

Massi Elisa, Barthélemy Jeanne, Mailly Juliane, Dromnelle Rémi, Canitrot Julien, Poniatowski Esther, Girard Benoît, Khamassi Mehdi

Abstract

Experience replay is widely used in AI to bootstrap reinforcement learning (RL) by enabling an agent to remember and reuse past experiences. Classical techniques include shuffled, reverse-ordered, and prioritized memory buffers, which have different properties and advantages depending on the nature of the data and problem. Interestingly, recent computational neuroscience work has shown that these techniques are relevant for modeling hippocampal reactivations recorded during rodent navigation. Nevertheless, the brain mechanisms for orchestrating hippocampal replay are still unclear. In this paper, we present recent neurorobotics research aiming to endow a navigating robot with a neuro-inspired RL architecture (including different learning strategies, such as model-based (MB) and model-free (MF), and different replay techniques). We illustrate through a series of numerical simulations how the specificities of robotic experimentation (e.g., autonomous state decomposition by the robot, noisy perception, state transition uncertainty, non-stationarity) can shed new light on which replay techniques turn out to be more efficient in different situations. Finally, we close the loop by raising new hypotheses for neuroscience from such robotic models of hippocampal replay.
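
The three classical buffer orderings named in the abstract can be illustrated with a minimal tabular Q-learning sketch. This is not the architecture from the paper: the two-action task, the `episode` data, the function names, and the learning parameters are all assumptions made for illustration, and the prioritized variant simply sorts the buffer once by absolute temporal-difference error rather than re-prioritizing after every update.

```python
import random

ACTIONS = (0, 1)  # hypothetical two-action task (assumption)


def td_error(q, transition, gamma=0.9):
    """Temporal-difference error of one (state, action, reward, next_state) tuple."""
    s, a, r, s_next = transition
    best_next = max(q.get((s_next, b), 0.0) for b in ACTIONS)
    return r + gamma * best_next - q.get((s, a), 0.0)


def replay_order(buffer, q, mode, gamma=0.9, rng=None):
    """Return the stored transitions in the order they will be replayed."""
    if mode == "shuffled":
        order = list(buffer)
        (rng or random).shuffle(order)
        return order
    if mode == "reversed":
        # Most recent experience first, back toward the start of the episode.
        return list(reversed(buffer))
    if mode == "prioritized":
        # Largest absolute TD error first (one static sort per sweep).
        return sorted(buffer, key=lambda t: abs(td_error(q, t, gamma)), reverse=True)
    raise ValueError(f"unknown replay mode: {mode}")


def replay(q, buffer, mode, alpha=0.5, gamma=0.9, rng=None):
    """Apply one sweep of Q-learning updates over the buffer in the chosen order."""
    for t in replay_order(buffer, q, mode, gamma, rng):
        s, a, _, _ = t
        q[(s, a)] = q.get((s, a), 0.0) + alpha * td_error(q, t, gamma)
    return q


# Hypothetical episode on a 4-state chain; only the last step is rewarded.
episode = [(0, 0, 0.0, 1), (1, 0, 0.0, 2), (2, 0, 1.0, 3)]
q_reversed = replay({}, episode, "reversed")
```

In this toy chain, a single reverse-ordered sweep propagates the final reward back to the first state, whereas replaying the same transitions in their original order would only update the last one. That difference is one of the ordering-dependent properties the abstract alludes to.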

Publisher

Frontiers Media SA

Subject

Artificial Intelligence, Biomedical Engineering

References (61 articles)

1. Spatial cognition and neuro-mimetic navigation: a model of hippocampal place cell activity; Arleo; Biol. Cybern., 2000

2. Prioritized sweeping neural DynaQ with multiple predecessors, and hippocampal replays; Aubin, 2018

3. Coherent theta oscillations and reorganization of spike timing in the hippocampal-prefrontal network upon learning; Benchenane; Neuron, 2010

4. A biologically inspired meta-control navigation system for the Psikharpax rat robot; Caluwaerts; Bioinspiration Biomimet., 2012

5. Modern Mathematical Methods for Physicists and Engineers

Cited by 5 articles.

1. An Improved Dyna-Q Algorithm Inspired by the Forward Prediction Mechanism in the Rat Brain for Mobile Robot Path Planning; Biomimetics; 2024-05-23

2. Learning While Sleeping: Integrating Sleep-Inspired Consolidation with Human Feedback Learning; 2024 IEEE International Conference on Development and Learning (ICDL); 2024-05-20

3. A New Paradigm to Study Social and Physical Affordances as Model-Based Reinforcement Learning; 2024

4. A new paradigm to study social and physical affordances as model-based reinforcement learning; Cognitive Robotics; 2024

5. An immediate-return reinforcement learning for the atypical Markov decision processes; Frontiers in Neurorobotics; 2022-12-13