Learning Fixed Points of Recurrent Neural Networks by Reparameterizing the Network Model

Published: 2024-07-19 Issue: 8 Volume: 36 Pages: 1568-1600
ISSN: 0899-7667
Container-title: Neural Computation
Language: en

Author:

Zhu Vicky¹, Rosenbaum Robert²

Affiliation:

1. Babson College, Mathematics, Analytics, Science, and Technology Division, Wellesley, MA 02481, U.S.A. vzhu@babson.edu

2. University of Notre Dame, Department of Applied and Computational Mathematics and Statistics, Notre Dame, IN 46556, U.S.A. Robert.Rosenbaum@nd.edu

Abstract

In computational neuroscience, recurrent neural networks are widely used to model neural activity and learning. In many studies, fixed points of recurrent neural networks are used to model neural responses to static or slowly changing stimuli, such as visual cortical responses to static visual stimuli. These applications raise the question of how to train the weights in a recurrent neural network to minimize a loss function evaluated on fixed points. In parallel, training fixed points is a central topic in the study of deep equilibrium models in machine learning. A natural approach is to use gradient descent on the Euclidean space of weights. We show that this approach can lead to poor learning performance due in part to singularities that arise in the loss surface. We use a reparameterization of the recurrent network model to derive two alternative learning rules that produce more robust learning dynamics. We demonstrate that these learning rules avoid singularities and learn more effectively than standard gradient descent. The new learning rules can be interpreted as steepest descent and gradient descent, respectively, under a non-Euclidean metric on the space of recurrent weights. Our results question the common, implicit assumption that learning in the brain should be expected to follow the negative Euclidean gradient of synaptic weights.
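To make the setting concrete, here is a minimal sketch of standard (Euclidean) gradient descent on a loss evaluated at a fixed point. It is an illustrative toy and not the authors' model or learning rules: it assumes a single linear unit with recurrence r = w*r + x, whose fixed point is r* = x / (1 - w), and a squared-error loss against a target y. All function names and parameter values are hypothetical. In this toy case, the loss and its gradient blow up as w approaches 1, which gives one simple picture of how a singularity can appear in the loss surface.

```python
def fixed_point(w, x, n_iter=1000):
    """Iterate r <- w*r + x from r = 0 (assumed toy dynamics).

    For |w| < 1 the iteration converges to the fixed point x / (1 - w).
    """
    r = 0.0
    for _ in range(n_iter):
        r = w * r + x
    return r


def loss(w, x, y):
    """Squared error between the fixed point x / (1 - w) and the target y."""
    r = x / (1.0 - w)
    return 0.5 * (r - y) ** 2


def grad_w(w, x, y):
    """Euclidean gradient dL/dw, using dr*/dw = x / (1 - w)**2."""
    r = x / (1.0 - w)
    return (r - y) * x / (1.0 - w) ** 2


def train(w, x, y, lr=0.05, steps=500):
    """Plain gradient descent on the recurrent weight w."""
    for _ in range(steps):
        w -= lr * grad_w(w, x, y)
    return w


if __name__ == "__main__":
    x, y = 1.0, 2.0                     # input and target fixed point
    w_hat = train(0.0, x, y)            # start from no recurrence
    print("learned w:", w_hat)
    print("fixed point at learned w:", fixed_point(w_hat, x))
    # Near w = 1 the gradient becomes very large in this toy model.
    print("gradient at w = 0.99:", grad_w(0.99, x, y))
```

The small learning rate is chosen by hand for this particular x and y; larger steps can push w toward or past the singular value w = 1, where the fixed point no longer exists as the limit of the iteration.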

Publisher

MIT Press

Link

https://direct.mit.edu/neco/article-pdf/36/8/1568/2462319/neco_a_01681.pdf
