Abstract
Evolutionary Reinforcement Learning (ERL) has attracted widespread attention in recent years due to its inherent robustness and parallelism. However, the integration of Evolutionary Algorithms (EAs) and Reinforcement Learning (RL) remains relatively rudimentary and static, which can limit the convergence performance of ERL algorithms. In this study, a dynamic adaptive module is introduced to balance Evolution Strategies (ES) and RL training within ERL. By incorporating elite strategies, this module leverages advantageous individuals to raise the performance of the population as a whole. In addition, RL policy updates often lack guidance from the population. To address this, we incorporate the policy of the best individual in the population as a guidance signal, formulated as a loss term with either L1 or L2 regularization during RL training. The proposed framework is referred to as Adaptive Evolutionary Reinforcement Learning (AERL). Its effectiveness is evaluated by adopting Soft Actor-Critic (SAC) as the RL algorithm and comparing the resulting Adaptive Evolutionary Soft Actor-Critic (AESAC) with other algorithms in the MuJoCo environment. The results demonstrate the strong convergence performance of AESAC, and ablation experiments confirm the necessity of the two improvements. Notably, the enhancements in AESAC operate at the population level, enabling broader exploration and effectively reducing the risk of falling into local optima.
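As a rough sketch of the population-guidance idea summarized above (not the paper's exact formulation: the choice of penalizing action discrepancies on replayed states, the replay distribution D, and the weighting coefficient \lambda are assumptions for illustration), the guided actor objective can be written as a standard SAC actor loss plus an L1 or L2 penalty that pulls the RL policy \pi_\theta toward the policy \pi_{\text{best}} of the best individual in the population:

L_{\text{actor}}(\theta) \;=\; L_{\text{SAC}}(\theta) \;+\; \lambda \, \mathbb{E}_{s \sim \mathcal{D}} \big[ \lVert \pi_\theta(s) - \pi_{\text{best}}(s) \rVert_p \big], \qquad p \in \{1, 2\}.

Here a larger \lambda gives the elite individual more influence over the RL update, while \lambda = 0 recovers plain SAC training.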
Funder
National Natural Science Foundation of China
Publisher
Springer Science and Business Media LLC