Signal Novelty Detection as an Intrinsic Reward for Robotics

Authors:

Martin Kubovčík 1, Iveta Dirgová Luptáková 1, Jiří Pospíchal 1

Affiliation:

1. Department of Applied Informatics, Faculty of Natural Sciences, University of Ss. Cyril and Methodius, J. Herdu 2, 917 01 Trnava, Slovakia

Abstract

In advanced robot control, reinforcement learning is a common technique for transforming sensor data into signals for actuators, based on feedback from the robot’s environment. However, the feedback, or reward, is typically sparse, as it is provided mainly after a task’s completion or failure, leading to slow convergence. Additional intrinsic rewards based on state visitation frequency can provide denser feedback. In this study, an autoencoder deep neural network was used as a novelty detector that supplies intrinsic rewards to guide the search through the state space. The network processed signals from several types of sensors simultaneously. It was tested on simulated robotic agents in a benchmark set of classic control OpenAI Gym environments (Mountain Car, Acrobot, CartPole, and LunarLander) and achieved more efficient and accurate robot control in three of the four tasks (with only slight degradation in the LunarLander task) when purely intrinsic rewards were used instead of standard extrinsic rewards. By incorporating autoencoder-based intrinsic rewards, robots could become more dependable in autonomous operations such as space or underwater exploration or natural disaster response, because the system can better adapt to changing environments and unexpected situations.
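To make the mechanism concrete, the sketch below shows one way an autoencoder’s reconstruction error can serve as an intrinsic reward inside a standard Gym-style control loop. The network size, learning rate, and random-action placeholder policy are illustrative assumptions, not the exact configuration reported in the paper.

import gymnasium as gym
import torch
import torch.nn as nn

class Autoencoder(nn.Module):
    # Small fully connected autoencoder over raw observations
    # (sizes are illustrative, not the paper's architecture).
    def __init__(self, obs_dim: int, latent_dim: int = 4):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(obs_dim, 32), nn.ReLU(),
            nn.Linear(32, latent_dim),
        )
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 32), nn.ReLU(),
            nn.Linear(32, obs_dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.decoder(self.encoder(x))

env = gym.make("MountainCar-v0")
ae = Autoencoder(env.observation_space.shape[0])
optimizer = torch.optim.Adam(ae.parameters(), lr=1e-3)

obs, _ = env.reset(seed=0)
for step in range(1000):
    action = env.action_space.sample()  # placeholder for the learned policy
    obs, ext_reward, terminated, truncated, _ = env.step(action)

    x = torch.as_tensor(obs, dtype=torch.float32).unsqueeze(0)
    loss = nn.functional.mse_loss(ae(x), x)
    # Rarely visited states reconstruct poorly, so the error is high there;
    # in the purely intrinsic setting, this value alone rewards the policy.
    intrinsic_reward = loss.item()

    # Training on each visited state lowers the reward for familiar states
    # over time, steering the agent toward unexplored regions.
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    if terminated or truncated:
        obs, _ = env.reset()
env.close()

In a complete agent, intrinsic_reward would feed the policy update (for example, a policy-gradient or Q-learning step) in place of, or in addition to, the sparse extrinsic reward.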

Funder

Cultural and Educational Grant Agency MŠVVaŠ SR

Erasmus+ project FAAI: The Future is in Applied Artificial Intelligence

Publisher

MDPI AG

Subject

Electrical and Electronic Engineering; Biochemistry; Instrumentation; Atomic and Molecular Physics, and Optics; Analytical Chemistry

References (57 articles; first five listed):

1. Pathak, D., Agrawal, P., Efros, A.A., and Darrell, T. (2017, January 6–11). Curiosity-Driven Exploration by Self-Supervised Prediction. Proceedings of the International Conference on Machine Learning, PMLR, Sydney, Australia. Available online: https://arxiv.org/pdf/1705.05363.pdf.

2. Burda, Y., Edwards, H., Storkey, A., and Klimov, O. (2018). Exploration by Random Network Distillation. Available online: https://arxiv.org/abs/1810.12894 (accessed on 7 March 2023).

3. Bellemare, M., Srinivasan, S., Ostrovski, G., Schaul, T., Saxton, D., and Munos, R. (2016). Unifying Count-Based Exploration and Intrinsic Motivation. Adv. Neural Inf. Process. Syst., 29, Available online: https://arxiv.org/abs/1606.01868.

4. Tang, H., Houthooft, R., Foote, D., Stooke, A., Chen, X., Duan, Y., Schulman, J., De Turck, F., and Abbeel, P. (2017). #Exploration: A Study of Count-Based Exploration for Deep Reinforcement Learning. Adv. Neural Inf. Process. Syst., 30, Available online: https://arxiv.org/pdf/1611.04717.pdf.

5. Oudeyer, P.-Y., Kaplan, F., and Hafner, V.V. (2007). Intrinsic Motivation Systems for Autonomous Mental Development. IEEE Trans. Evol. Comput., 11, 265–286.

Cited by 2 articles:

1. Accurate Kinematic Modeling using Autoencoders on Differentiable Joints. 2024 IEEE International Conference on Robotics and Automation (ICRA), 13 May 2024.

2. Improved Robot Path Planning Method Based on Deep Reinforcement Learning. Sensors, 15 June 2023.
