Reinforcement Learning Through Modulation of Spike-Timing-Dependent Synaptic Plasticity

Published:2007-06 Issue:6 Volume:19 Page:1468-1502
ISSN:0899-7667
Container-title:Neural Computation
language:en
Short-container-title:Neural Computation

Author:

Florian Răzvan V.¹

Affiliation:

1. Center for Cognitive and Neural Studies (Coneural), 400504 Cluj-Napoca, Romania, and Babeş-Bolyai University, Institute for Interdisciplinary Experimental Research, 400271 Cluj-Napoca, Romania

Abstract

The persistent modification of synaptic efficacy as a function of the relative timing of pre- and postsynaptic spikes is a phenomenon known as spike-timing-dependent plasticity (STDP). Here we show that the modulation of STDP by a global reward signal leads to reinforcement learning. We first derive analytically learning rules involving reward-modulated spike-timing-dependent synaptic and intrinsic plasticity, by applying a reinforcement learning algorithm to the stochastic spike response model of spiking neurons. These rules have several features common to plasticity mechanisms experimentally found in the brain. We then demonstrate in simulations of networks of integrate-and-fire neurons the efficacy of two simple learning rules involving modulated STDP. One rule is a direct extension of the standard STDP model (modulated STDP), and the other one involves an eligibility trace stored at each synapse that keeps a decaying memory of the relationships between the recent pairs of pre- and postsynaptic spikes (modulated STDP with eligibility trace). This latter rule permits learning even if the reward signal is delayed. The proposed rules are able to solve the XOR problem with both rate coded and temporally coded input and to learn a target output firing-rate pattern. These learning rules are biologically plausible, may be used for training generic artificial spiking neural networks, regardless of the neural model used, and suggest the experimental investigation in animals of the existence of reward-modulated STDP.
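The two rules described in the abstract can be illustrated with a minimal single-synapse sketch. It is not the paper's implementation: it assumes discrete time steps, exponential STDP windows, and a scalar reward per step, and the function name, parameter names, and default values (`tau_plus`, `tau_z`, `gamma`, and so on) are hypothetical choices made for illustration only.

```python
import math

def run_modulated_stdp(pre_spikes, post_spikes, rewards, *,
                       use_eligibility_trace=True,
                       dt=1.0, tau_plus=20.0, tau_minus=20.0,
                       a_plus=1.0, a_minus=-1.0,
                       tau_z=25.0, gamma=0.01, w0=0.5):
    """Return the weight trajectory of one synapse (illustrative sketch).

    pre_spikes, post_spikes: sequences of 0/1 spike indicators per time step.
    rewards: sequence of scalar reward values per time step.
    All parameter names and defaults are assumptions, not values from the paper.
    """
    decay_plus = math.exp(-dt / tau_plus)
    decay_minus = math.exp(-dt / tau_minus)
    decay_z = math.exp(-dt / tau_z)

    p_plus = 0.0   # decaying trace of recent presynaptic spikes
    p_minus = 0.0  # decaying trace of recent postsynaptic spikes
    z = 0.0        # eligibility trace stored at the synapse
    w = w0
    weights = []

    for pre, post, r in zip(pre_spikes, post_spikes, rewards):
        # Update the spike traces that implement the STDP windows.
        p_plus = p_plus * decay_plus + a_plus * pre
        p_minus = p_minus * decay_minus + a_minus * post

        # Instantaneous STDP term: pre-before-post gives a positive value,
        # post-before-pre a negative one (with a_minus < 0).
        stdp = p_plus * post + p_minus * pre

        if use_eligibility_trace:
            # Keep a decaying memory of recent STDP events and let the
            # reward act on that memory, so delayed reward still has effect.
            z = z * decay_z + stdp / tau_z
            w += gamma * r * z
        else:
            # Direct modulation: reward must coincide with the STDP event.
            w += gamma * r * stdp
        weights.append(w)

    return weights


# Usage: one pre spike at step 0, one post spike at step 5,
# and a positive reward that arrives only at step 10.
n = 20
pre = [0] * n
post = [0] * n
reward = [0.0] * n
pre[0], post[5], reward[10] = 1, 1, 1.0

with_trace = run_modulated_stdp(pre, post, reward)
without_trace = run_modulated_stdp(pre, post, reward,
                                   use_eligibility_trace=False)
print(with_trace[-1] > 0.5)     # weight grew despite the delayed reward
print(without_trace[-1] == 0.5) # no change: reward missed the STDP event
```

In this toy setting the eligibility trace is what bridges the gap between the spike pairing and the later reward; without it, the reward arrives when the instantaneous STDP term is already zero, so the weight stays at its initial value.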

Publisher

MIT Press - Journals

Subject

Cognitive Neuroscience, Arts and Humanities (miscellaneous)

Link

https://www.mitpressjournals.org/doi/pdf/10.1162/neco.2007.19.6.1468

Cited by 222 articles.

1. Quantum computing and neuroscience for 6G/7G networks: Survey;Intelligent Systems with Applications;2024-09

2. Tuning Synaptic Connections Instead of Weights by Genetic Algorithm in Spiking Policy Network;Machine Intelligence Research;2024-08-17

3. From Information to Knowledge: A Role for Knowledge Networks in Decision Making and Action Selection;Information;2024-08-15

4. A Burst-Dependent Algorithm for Neuromorphic On-Chip Learning of Spiking Neural Networks;2024-07-23

5. TDSTDP;2024-07-23