BrainQN: Enhancing the Robustness of Deep Reinforcement Learning with Spiking Neural Networks

Authors:

Feng Shuo 1, Cao Jian 1, Ou Zehong 1, Chen Guang 1, Zhong Yi 2, Wang Zilin 2, Yan Juntong 1, Chen Jue 1, Wang Bingsen 1, Zou Chenglong 3, Feng Zebang 1, Wang Yuan 2,4

Affiliation:

1. School of Software & Microelectronics, Peking University, Beijing 102600, China

2. Key Laboratory of Microelectronic Devices and Circuits (MoE), MPW Center, School of Integrated Circuits, Peking University, Beijing 100871, China

3. Peking University Chongqing Research Institute of Big Data, Chongqing 400030, China

4. Beijing Advanced Innovation Center for Integrated Circuits, Beijing 100871, China

Abstract

As the third generation of neural networks, succeeding artificial neural networks (ANNs), spiking neural networks (SNNs) offer high robustness and low energy consumption. Inspired by biological systems, this work addresses the limitations of low robustness and high power consumption in deep reinforcement learning (DRL) by introducing SNNs. The Brain Q-network (BrainQN) is proposed, which replaces the neurons in the classic Deep Q-learning (DQN) algorithm with spiking neurons. BrainQN is trained both with surrogate gradient learning (SGL) and by ANN-to-SNN conversion. Robustness tests with input noise show BrainQN's superior performance, achieving an 82.14% increase in reward under low noise and a 71.74% increase under high noise compared with DQN. These findings highlight BrainQN's robustness in noisy environments and support its application in complex scenarios. The SGL-trained BrainQN is more robust than the ANN-to-SNN-converted one under high noise; the difference in correlation between the network's outputs for noisy and original inputs, together with the distinctions between the two training algorithms, explains this phenomenon. BrainQN successfully transferred from a simulated Pong environment to a ball-catching robot equipped with dynamic vision sensors (DVS). Deployed on the neuromorphic chip PAICORE, it shows significant advantages in latency and power consumption over an NVIDIA Jetson Xavier NX.
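To make the two ingredients named in the abstract concrete, the sketch below shows a leaky integrate-and-fire (LIF) neuron, the kind of spiking unit typically substituted for ANN neurons, and a rectangular surrogate-gradient function of the sort used in SGL to stand in for the non-differentiable firing threshold during backpropagation. This is a minimal illustrative sketch, not the paper's implementation; the time constant, threshold, and surrogate width are assumed values, and the function names are hypothetical.

```python
import numpy as np

def lif_forward(inputs, tau=2.0, v_th=1.0):
    """Simulate one LIF neuron over T timesteps.
    Returns the binary spike train and the membrane-potential trace."""
    v = 0.0
    spikes, v_trace = [], []
    for x in inputs:
        v = v + (x - v) / tau           # leaky integration toward the input
        s = 1.0 if v >= v_th else 0.0   # fire when the threshold is crossed
        v = v * (1.0 - s)               # hard reset after a spike
        spikes.append(s)
        v_trace.append(v)
    return np.array(spikes), np.array(v_trace)

def surrogate_grad(v, v_th=1.0, alpha=2.0):
    """Rectangular surrogate for d(spike)/d(v): the Heaviside firing
    function has zero gradient almost everywhere, so SGL backpropagates
    this box-shaped stand-in centered on the threshold instead."""
    return (np.abs(v - v_th) < 1.0 / alpha).astype(float) * alpha / 2.0

# A constant supra-threshold drive produces periodic spiking.
spikes, v_trace = lif_forward(np.full(10, 1.5))
```

With these parameters the neuron charges for one step, fires on the next, and repeats, so the 10-step input yields 5 spikes; the surrogate gradient is nonzero only in a narrow band around the threshold, which is what lets gradient-based DRL losses flow through the spiking layers.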

Publisher

Wiley

