Towards Non-Saturating Recurrent Units for Modelling Long-Term Dependencies

Published: 2019-07-17 Volume: 33 Page: 3280-3287
ISSN: 2374-3468
Container-title: Proceedings of the AAAI Conference on Artificial Intelligence
Short-container-title: AAAI

Author:

Chandar Sarath, Sankar Chinnadhurai, Vorontsov Eugene, Kahou Samira Ebrahimi, Bengio Yoshua

Abstract

Modelling long-term dependencies is a challenge for recurrent neural networks. This is primarily because gradients vanish during training as the sequence length increases. Gradients can be attenuated by transition operators and are attenuated or dropped by activation functions. Canonical architectures like LSTM alleviate this issue by skipping information through a memory mechanism. We propose a new recurrent architecture (Non-saturating Recurrent Unit; NRU) that relies on a memory mechanism but forgoes both saturating activation functions and saturating gates, in order to further alleviate vanishing gradients. In a series of synthetic and real-world tasks, we demonstrate that the proposed model is the only model that performs among the top 2 models across all tasks, with and without long-term dependencies, when compared against a range of other architectures.
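The attenuation described in the abstract can be illustrated with a toy calculation. The sketch below is not the NRU and is not taken from the paper; it assumes a scalar recurrence h_t = f(w * h_{t-1} + x_t) with a hypothetical helper `gradient_through_time`, and simply multiplies the per-step chain-rule factors to compare a saturating activation (tanh) with a non-saturating one (ReLU).

```python
import math


def tanh_grad(z):
    """Derivative of tanh; approaches 0 as |z| grows (saturation)."""
    return 1.0 - math.tanh(z) ** 2


def relu(z):
    return max(0.0, z)


def relu_grad(z):
    """Derivative of ReLU; 1 for positive inputs, 0 otherwise (gradient dropped)."""
    return 1.0 if z > 0 else 0.0


def gradient_through_time(activation, activation_grad, w, inputs, h0=0.0):
    """Return |d h_T / d h_0| for the scalar recurrence h_t = f(w * h_{t-1} + x_t).

    Hypothetical helper for illustration only: each time step contributes
    one chain-rule factor w * f'(pre-activation).
    """
    h, grad = h0, 1.0
    for x in inputs:
        pre = w * h + x
        grad *= w * activation_grad(pre)
        h = activation(pre)
    return abs(grad)


# Assumed toy setting: 50 time steps, constant input 1.0, recurrent weight 1.0.
inputs = [1.0] * 50
g_tanh = gradient_through_time(math.tanh, tanh_grad, 1.0, inputs)
g_relu = gradient_through_time(relu, relu_grad, 1.0, inputs)

print(f"tanh: {g_tanh:.3e}")
print(f"relu: {g_relu:.3e}")
```

In this assumed setting the tanh state saturates after a few steps, so every later factor is well below 1 and the product shrinks rapidly with sequence length, while the ReLU factors stay at 1 because the pre-activation remains positive. With negative pre-activations the ReLU factor would be 0, which corresponds to the gradient being dropped rather than attenuated.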

Publisher

Association for the Advancement of Artificial Intelligence (AAAI)

Subject

General Medicine

Cited by 12 articles.

1. Residual Echo State Networks: Residual recurrent neural networks with stable dynamics and fast learning;Neurocomputing;2024-09

2. Can Machine Learn Pipeline Leakage?;2024 Design, Automation & Test in Europe Conference & Exhibition (DATE);2024-03-25

3. A Study on the Health Index Based on Degradation Patterns in Time Series Data Using ProphetNet Model;Journal of Society of Korea Industrial and Systems Engineering;2023-09-30

4. Towards a better consideration of rainfall and hydrological spatial features by a deep neural network model to improve flash floods forecasting: case study on the Gardon basin, France;Modeling Earth Systems and Environment;2023-01-09

5. Effective short-term forecasts of Saudi stock price trends using technical indicators and large-scale multivariate time series;PeerJ Computer Science;2023-01-06