Can neural networks benefit from objectives that encourage iterative convergent computations? A case study of ResNets and object classification

Authors

Samuel Lippl, Benjamin Peters, Nikolaus Kriegeskorte

Abstract

Recent work has suggested that feedforward residual neural networks (ResNets) approximate iterative recurrent computations. Iterative computations are useful in many domains, so they might provide good solutions for neural networks to learn. However, principled methods for measuring and manipulating iterative convergence in neural networks remain lacking. Here we address this gap by (1) quantifying the degree to which ResNets learn iterative solutions and (2) introducing a regularization approach that encourages the learning of iterative solutions. Iterative methods are characterized by two properties: iteration and convergence. To quantify these properties, we define three indices of iterative convergence. Consistent with previous work, we show that, even though ResNets can express iterative solutions, they do not learn them when trained conventionally on computer-vision tasks. We then introduce regularizations to encourage iterative convergent computation and test whether this provides a useful inductive bias. To make the networks more iterative, we manipulate the degree of weight sharing across layers using soft gradient coupling. This new method provides a form of recurrence regularization and can interpolate smoothly between an ordinary ResNet and a “recurrent” ResNet (i.e., one that uses identical weights across layers and thus could be physically implemented with a recurrent network computing the successive stages iteratively across time). To make the networks more convergent, we impose a Lipschitz constraint on the residual functions using spectral normalization. The three indices of iterative convergence reveal that the gradient coupling and the Lipschitz constraint succeed at making the networks iterative and convergent, respectively. To showcase the practicality of our approach, we study how iterative convergence affects generalization on standard visual recognition tasks (MNIST, CIFAR-10, CIFAR-100) and on a challenging recognition task with partial occlusions (Digitclutter). We find that, in these tasks, iterative convergent computation does not provide a useful inductive bias for ResNets. Importantly, our approach may be useful for investigating other network architectures and tasks as well, and we hope that our study provides a useful starting point for the broader question of whether iterative convergence can help neural networks generalize.
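To make the two regularizers concrete, the following is a minimal PyTorch sketch, not the authors' code: `ResNetStack`, `couple_gradients`, and the hyperparameter `lam` are illustrative names, and the coupling rule is an assumption that each residual block's gradient is mixed with the mean gradient across blocks, so `lam = 0` leaves an ordinary ResNet while `lam = 1` (with identical initialization) keeps all blocks tied, as in a recurrent ResNet unrolled over depth. The Lipschitz constraint is sketched with PyTorch's built-in spectral normalization applied to each convolution in the residual functions.

```python
import torch
import torch.nn as nn
from torch.nn.utils.parametrizations import spectral_norm


def make_block(channels: int) -> nn.Sequential:
    # Residual function f_l. spectral_norm constrains each conv's
    # spectral norm, encouraging f_l to be (approximately) 1-Lipschitz
    # and hence the residual iteration to be convergent.
    return nn.Sequential(
        spectral_norm(nn.Conv2d(channels, channels, 3, padding=1)),
        nn.ReLU(),
        spectral_norm(nn.Conv2d(channels, channels, 3, padding=1)),
    )


class ResNetStack(nn.Module):
    # Illustrative stand-in for the residual stage of a ResNet.
    def __init__(self, channels: int, depth: int):
        super().__init__()
        self.blocks = nn.ModuleList(make_block(channels) for _ in range(depth))

    def forward(self, x):
        for block in self.blocks:
            x = x + block(x)  # residual update: h_{l+1} = h_l + f_l(h_l)
        return x


def couple_gradients(blocks: nn.ModuleList, lam: float) -> None:
    # Assumed soft gradient coupling: replace each block's gradient g_l
    # with (1 - lam) * g_l + lam * mean(g_l over blocks).
    for params in zip(*(b.parameters() for b in blocks)):
        grads = [p.grad for p in params]
        if any(g is None for g in grads):
            continue
        mean_grad = torch.stack(grads).mean(dim=0)
        for p in params:
            p.grad.mul_(1.0 - lam).add_(mean_grad, alpha=lam)


# One training step with hypothetical data and lam = 0.5.
model = ResNetStack(channels=16, depth=4)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
x, y = torch.randn(8, 16, 8, 8), torch.randn(8, 16, 8, 8)
loss = nn.functional.mse_loss(model(x), y)
loss.backward()
couple_gradients(model.blocks, lam=0.5)  # between backward() and step()
optimizer.step()
```

Because the coupling acts only on gradients after `backward()` and before the optimizer step, varying `lam` interpolates smoothly between independent and shared weights without changing the architecture, which is what allows an ordinary ResNet and a recurrent ResNet to be treated as two ends of one continuum.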

Funder

H2020 Marie Skłodowska-Curie Actions

National Science Foundation

Gatsby Charitable Foundation

Simons Foundation

Publisher

Public Library of Science (PLoS)
