Signal Novelty Detection as an Intrinsic Reward for Robotics

Authors:

Martin Kubovčík 1, Iveta Dirgová Luptáková 1, Jiří Pospíchal 1

Affiliation:

1. Department of Applied Informatics, Faculty of Natural Sciences, University of Ss. Cyril and Methodius, J. Herdu 2, 917 01 Trnava, Slovakia

Abstract

In advanced robot control, reinforcement learning is a common technique for transforming sensor data into signals for actuators, based on feedback from the robot’s environment. However, the feedback, or reward, is typically sparse, as it is provided mainly after a task’s completion or failure, leading to slow convergence. Additional intrinsic rewards based on state visitation frequency can provide denser feedback. In this study, an autoencoder deep neural network was used as a novelty detector that supplies intrinsic rewards to guide the search through the state space. The network processed signals from several types of sensors simultaneously. It was tested on simulated robotic agents in a benchmark set of classic control OpenAI Gym environments (Mountain Car, Acrobot, CartPole, and LunarLander) and achieved more efficient and accurate robot control in three of the four tasks (with only slight degradation in the LunarLander task) when purely intrinsic rewards were used instead of standard extrinsic rewards. By incorporating autoencoder-based intrinsic rewards, robots could become more dependable in autonomous operations such as space or underwater exploration or natural disaster response, because the system can better adapt to changing environments and unexpected situations.
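To make the mechanism concrete, the sketch below shows one way an autoencoder’s reconstruction error can serve as an intrinsic reward inside a standard Gym-style control loop. The network size, learning rate, and random-action placeholder policy are illustrative assumptions, not the exact configuration reported in the paper.

import gymnasium as gym
import torch
import torch.nn as nn

class Autoencoder(nn.Module):
    # Small fully connected autoencoder over raw observations
    # (sizes are illustrative, not the paper's architecture).
    def __init__(self, obs_dim: int, latent_dim: int = 4):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(obs_dim, 32), nn.ReLU(),
            nn.Linear(32, latent_dim),
        )
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 32), nn.ReLU(),
            nn.Linear(32, obs_dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.decoder(self.encoder(x))

env = gym.make("MountainCar-v0")
ae = Autoencoder(env.observation_space.shape[0])
optimizer = torch.optim.Adam(ae.parameters(), lr=1e-3)

obs, _ = env.reset(seed=0)
for step in range(1000):
    action = env.action_space.sample()  # placeholder for the learned policy
    obs, ext_reward, terminated, truncated, _ = env.step(action)

    x = torch.as_tensor(obs, dtype=torch.float32).unsqueeze(0)
    loss = nn.functional.mse_loss(ae(x), x)
    # Rarely visited states reconstruct poorly, so the error is high there;
    # in the purely intrinsic setting, this value alone rewards the policy.
    intrinsic_reward = loss.item()

    # Training on each visited state lowers the reward for familiar states
    # over time, steering the agent toward unexplored regions.
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    if terminated or truncated:
        obs, _ = env.reset()
env.close()

In a complete agent, intrinsic_reward would feed the policy update (for example, a policy-gradient or Q-learning step) in place of, or in addition to, the sparse extrinsic reward.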

Funder

Cultural and Educational Grant Agency MŠVVaŠ SR

Erasmus+ project FAAI: The Future is in Applied Artificial Intelligence

Publisher

MDPI AG

Subject

Electrical and Electronic Engineering; Biochemistry; Instrumentation; Atomic and Molecular Physics, and Optics; Analytical Chemistry

References (57 articles; first five listed):

1. Pathak, D., Agrawal, P., Efros, A.A., and Darrell, T. (2017, January 6–11). Curiosity-Driven Exploration by Self-Supervised Prediction. Proceedings of the International Conference on Machine Learning, PMLR, Sydney, Australia. Available online: https://arxiv.org/pdf/1705.05363.pdf.

2. Burda, Y., Edwards, H., Storkey, A., and Klimov, O. (2018). Exploration by Random Network Distillation. Available online: https://arxiv.org/abs/1810.12894 (accessed on 7 March 2023).

3. Bellemare, M., Srinivasan, S., Ostrovski, G., Schaul, T., Saxton, D., and Munos, R. (2016). Unifying Count-Based Exploration and Intrinsic Motivation. Adv. Neural Inf. Process. Syst., 29, Available online: https://arxiv.org/abs/1606.01868.

4. Tang, H., Houthooft, R., Foote, D., Stooke, A., Chen, X., Duan, Y., Schulman, J., De Turck, F., and Abbeel, P. (2017). #Exploration: A Study of Count-Based Exploration for Deep Reinforcement Learning. Adv. Neural Inf. Process. Syst., 30, Available online: https://arxiv.org/pdf/1611.04717.pdf.

5. Oudeyer, P.-Y., Kaplan, F., and Hafner, V.V. (2007). Intrinsic Motivation Systems for Autonomous Mental Development. IEEE Trans. Evol. Comput., 11, 265–286.

Cited by 2 articles:

1. Accurate Kinematic Modeling using Autoencoders on Differentiable Joints. 2024 IEEE International Conference on Robotics and Automation (ICRA), 13 May 2024.

2. Improved Robot Path Planning Method Based on Deep Reinforcement Learning. Sensors, 15 June 2023.
