A novel model-based reinforcement learning algorithm for solving the problem of unbalanced reward

Author:

Yuan Yinlong¹, Hua Liang¹, Cheng Yun¹, Li Junhong¹, Sang Xiaohu¹, Zhang Lei¹, Wei Wu²

Affiliation:

1. Department of College of Electrical Engineering, Nantong University, Nantong, China

2. Department of College of Automation Science and Engineering, South China University of Technology, Guangzhou, China

Abstract

Reinforcement learning algorithms driven by reward signals can be used to solve sequential learning problems. In practice, however, they still suffer from reward imbalance, which limits their use in many contexts. To address this unbalanced reward problem, we propose a novel model-based reinforcement learning algorithm called expected n-step value iteration (EnVI). Unlike traditional model-based reinforcement learning algorithms, the proposed method uses a new return function that changes the discount applied to future rewards while reducing the influence of the current reward. We evaluated the proposed algorithm on a Treasure-Hunting game and a Hill-Walking game. The results demonstrate that it can reduce the negative impact of unbalanced rewards and greatly improve the performance of traditional reinforcement learning algorithms.
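For orientation, the sketch below shows conventional value iteration, the kind of traditional model-based baseline the abstract contrasts EnVI against. It is not the EnVI return function, which the abstract does not specify. The five-state chain, the reward sizes (1 and 10), the deterministic transitions, and the names `build_chain` and `value_iteration` are all hypothetical, chosen only to show how the discount factor decides between a small nearby reward and a large distant one.

```python
LEFT, RIGHT = 0, 1


def build_chain(n_states=5, small=1.0, large=10.0):
    """Hypothetical toy chain: state 0 and the last state are absorbing.

    Entering state 0 pays `small`; entering the last state pays `large`.
    transitions[s][a] is the next state, rewards[s][a] the immediate reward.
    """
    last = n_states - 1
    transitions, rewards = [], []
    for s in range(n_states):
        if s in (0, last):
            # Absorbing states: every action stays put and pays nothing.
            transitions.append([s, s])
            rewards.append([0.0, 0.0])
        else:
            transitions.append([s - 1, s + 1])
            rewards.append([
                small if s == 1 else 0.0,         # LEFT into state 0
                large if s == last - 1 else 0.0,  # RIGHT into last state
            ])
    return transitions, rewards


def value_iteration(transitions, rewards, gamma, tol=1e-9, max_iter=10_000):
    """Standard value iteration for a deterministic model (an assumption)."""
    n_states = len(transitions)
    values = [0.0] * n_states

    def q(s, a):
        # One-step lookahead using the known model.
        return rewards[s][a] + gamma * values[transitions[s][a]]

    for _ in range(max_iter):
        new_values = [
            max(q(s, a) for a in range(len(transitions[s])))
            for s in range(n_states)
        ]
        delta = max(abs(n - v) for n, v in zip(new_values, values))
        values[:] = new_values
        if delta < tol:
            break

    # Greedy policy with respect to the converged values (ties pick LEFT).
    policy = [
        max(range(len(transitions[s])), key=lambda a, s=s: q(s, a))
        for s in range(n_states)
    ]
    return values, policy


if __name__ == "__main__":
    transitions, rewards = build_chain()
    for gamma in (0.9, 0.2):
        values, policy = value_iteration(transitions, rewards, gamma)
        print(gamma, values, policy)
```

In this toy setting, the greedy action in state 1 is RIGHT (toward the large reward) with a discount factor of 0.9 and LEFT (toward the small reward) with 0.2. That sensitivity to discounting is the general kind of behaviour a modified return function would be expected to influence.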

Publisher

IOS Press

Subject

Artificial Intelligence, General Engineering, Statistics and Probability

Cited by 1 article.
