Conditional computation in neural networks: Principles and research trends

Published: 2024-07-31 Issue: 1 Volume: 18 Page: 175-190
ISSN: 1724-8035
Container-title: Intelligenza Artificiale
Short-container-title: IA

Author:

Scardapane Simone¹,Baiocchi Alessandro²,Devoto Alessio²,Marsocci Valerio³,Minervini Pasquale⁴,Pomponi Jary¹

Affiliation:

1. Dipartimento di Ingegneria dell’Informazione, Elettronica e Telecomunicazioni (DIET), Sapienza University of Rome, Rome, Italy

2. Department of Computer, Control, and Management Engineering Antonio Ruberti (DIAG), Rome, Italy

3. Geomatics Research Group, KU Leuven, Gent, Belgium

4. School of Informatics, University of Edinburgh, Edinburgh, UK

Abstract

This article summarizes principles and ideas from the emerging area of applying conditional computation methods to the design of neural networks. In particular, we focus on neural networks that can dynamically activate or de-activate parts of their computational graph conditionally on their input. Examples include the dynamic selection of, e.g., input tokens, layers (or sets of layers), and sub-modules inside each layer (e.g., channels in a convolutional filter). We first provide a general formalism to describe these techniques in a uniform way. Then, we introduce three notable implementations of these principles: mixture-of-experts (MoE) networks, token selection mechanisms, and early-exit neural networks. The paper aims to provide a tutorial-like introduction to this growing field. To this end, we analyze the benefits of these modular designs in terms of efficiency, explainability, and transfer learning, with a focus on emerging application areas ranging from automated scientific discovery to semantic communication.
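
As an illustration of the early-exit idea named in the abstract, the following is a minimal sketch in plain Python and not the formalism used in the article. It assumes a confidence-threshold exit rule, which is only one possible choice. The names `early_exit_predict`, `double_block`, `identity_head`, and `threshold` are hypothetical and introduced here for illustration only.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def early_exit_predict(x, blocks, heads, threshold=0.9):
    """Run `blocks` in order and stop at the first exit head whose top
    class probability reaches `threshold` (assumed exit rule).

    Returns (predicted_class, number_of_blocks_executed).
    The last block always produces a prediction.
    """
    if not blocks or len(blocks) != len(heads):
        raise ValueError("need one exit head per block, and at least one block")
    h = x
    for depth, (block, head) in enumerate(zip(blocks, heads), start=1):
        h = block(h)                  # computation that is always paid for
        probs = softmax(head(h))      # cheap auxiliary classifier at this depth
        confidence = max(probs)
        if confidence >= threshold or depth == len(blocks):
            return probs.index(confidence), depth


# Toy stand-ins for real layers (hypothetical): each block doubles the
# features, and each head uses the features directly as class logits.
def double_block(h):
    return [2.0 * v for v in h]


def identity_head(h):
    return h


blocks = [double_block] * 3
heads = [identity_head] * 3

easy = early_exit_predict([3.0, 0.0], blocks, heads)   # confident after 1 block
hard = early_exit_predict([0.1, 0.0], blocks, heads)   # falls through to the end
```

In this toy setting the "easy" input exits after the first block, while the "hard" input never reaches the threshold and uses all three blocks, so the amount of computation depends on the input.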

Publisher

IOS Press

References: 115 articles.

1. Neural module networks;Andreas;Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016

2. Composable Sparse Fine-Tuning for Cross-Lingual Transfer

3. Improving the accuracy of early exits in multi-exit architectures via curriculum learning;Bakhtiarnia;Proceedings of the 2021 International Joint Conference on Neural Networks (IJCNN)

4. Conditional computation in neural networks for faster models;Bengio;arXiv preprint arXiv:1511.06297