A Dynamic Adjusting Reward Function Method for Deep Reinforcement Learning with Adjustable Parameters

Published:2019-11-23 Volume:2019 Page:1-10
ISSN:1024-123X
Container-title:Mathematical Problems in Engineering
language:en
Short-container-title:Mathematical Problems in Engineering

Author:

Hu Zijian¹, Wan Kaifang¹, Gao Xiaoguang¹, Zhai Yiwei¹

Affiliation:

1. School of Electronic and Information, Northwestern Polytechnical University, Xi’an 710129, China

Abstract

In deep reinforcement learning, network convergence is often slow, and the network easily converges to locally optimal solutions. For environments with reward saltation, we propose a magnify saltatory reward (MSR) algorithm with variable parameters from the perspective of sample usage. MSR dynamically adjusts the rewards of experiences with reward saltation in the experience pool, thereby increasing an agent’s utilization of these experiences. We conducted experiments in a simulated obstacle avoidance search environment for an unmanned aerial vehicle and compared the results of deep Q-network (DQN), double DQN, and dueling DQN after adding MSR. The experimental results demonstrate that, after adding MSR, the algorithms converge faster and obtain the global optimal solution more easily.
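
The abstract does not give the MSR update rule, so the following is only a minimal sketch of the general idea of magnifying rewards for experiences stored at a reward jump. Everything specific here is an assumption made for illustration: the class name `MagnifyingReplayBuffer`, the definition of a saltation as a step-to-step reward change larger than `jump_threshold`, and the fixed multiplier `gain` standing in for the paper's adjustable parameters.

```python
import random
from collections import deque


class MagnifyingReplayBuffer:
    """Hypothetical experience pool that amplifies rewards at reward jumps.

    Not the authors' algorithm: the jump test and the fixed gain below are
    illustrative assumptions, since the abstract does not specify them.
    """

    def __init__(self, capacity=10000, jump_threshold=1.0, gain=2.0):
        self.buffer = deque(maxlen=capacity)
        self.jump_threshold = jump_threshold  # assumed definition of a "saltation"
        self.gain = gain                      # assumed magnification parameter
        self.prev_reward = None               # last raw reward in the current episode

    def add(self, state, action, reward, next_state, done):
        # A jump is flagged when the raw reward changes sharply between steps.
        is_jump = (
            self.prev_reward is not None
            and abs(reward - self.prev_reward) > self.jump_threshold
        )
        # Store a magnified reward for jump transitions, the raw reward otherwise.
        stored = reward * self.gain if is_jump else reward
        self.buffer.append((state, action, stored, next_state, done))
        # Do not compare rewards across episode boundaries.
        self.prev_reward = None if done else reward
        return is_jump

    def sample(self, batch_size):
        # Uniform sampling, as in a standard DQN experience pool.
        return random.sample(list(self.buffer), min(batch_size, len(self.buffer)))
```

With `jump_threshold=1.0` and `gain=2.0`, an episode with raw rewards -0.1, -0.1, 10.0 would be stored with rewards -0.1, -0.1, 20.0, so the terminal transition contributes a larger target when a DQN-style learner samples it.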

Funder

National Natural Science Foundation of China

Publisher

Hindawi Limited

Subject

General Engineering, General Mathematics

Link

http://downloads.hindawi.com/journals/mpe/2019/7619483.pdf

Cited by 27 articles.

1. Markov game based on reinforcement learning solution against cyber–physical attacks in smart grid;Expert Systems with Applications;2024-12

2. Adversarial Proximal Policy Optimisation for Robust Reinforcement Learning;AIAA SCITECH 2024 Forum;2024-01-04

3. RL-ECGNet: resource-aware multi-class detection of arrhythmia through reinforcement learning;Applied Intelligence;2023-11-29

4. Obstacle Avoidance System on Autonomous Car Using D3QN;2023 14th International Conference on Information & Communication Technology and System (ICTS);2023-10-04

5. Online computation offloading via deep convolutional feature map attention reinforcement learning and adaptive rewarding policy;Wireless Networks;2023-07-07