Affiliations:
1. Department of Informatics, King's College London, UK
2. Department of Behavioural and Cognitive Sciences, University of Luxembourg, Esch-sur-Alzette, Luxembourg
Abstract
Modelling one-to-many mappings in problems with a temporal component can be challenging. Backpropagation is not applicable to networks that perform discrete sampling and is also susceptible to gradient instabilities, especially when applied to longer sequences. In this paper, we propose two recurrent neural network architectures that leverage stochastic units and mixture models, and that are trained with target propagation. We demonstrate that these networks can model complex conditional probability distributions, outperform backpropagation-trained alternatives, and do not rapidly degrade with increased time horizons. Our main contributions are the design and evaluation of architectures that enable the networks to solve multi-modal problems with a temporal dimension, including an extension of the target propagation through time algorithm to handle stochastic neurons. The use of target propagation also provides a computational advantage, enabling the networks to handle time horizons substantially longer than those of networks fitted using backpropagation.
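To make the one-to-many modelling idea concrete, below is a minimal sketch (not the authors' code) of a recurrent network whose output parameterises a Gaussian mixture at each time step, so a single input sequence yields a multi-modal predictive distribution. The class name `MixtureDensityRNN` and parameters such as `n_components` are illustrative assumptions; the paper's architectures additionally use stochastic units and are trained with target propagation rather than the backpropagated likelihood loss shown here.

```python
# Illustrative sketch of a mixture-density recurrent network (PyTorch).
# Not the paper's architecture: no stochastic units, trained by ordinary
# maximum likelihood rather than target propagation through time.
import torch
import torch.nn as nn

class MixtureDensityRNN(nn.Module):
    def __init__(self, input_size, hidden_size, n_components):
        super().__init__()
        self.rnn = nn.GRU(input_size, hidden_size, batch_first=True)
        # One (mixing logit, mean, log-std) triple per mixture component.
        self.head = nn.Linear(hidden_size, 3 * n_components)

    def forward(self, x):
        h, _ = self.rnn(x)                        # (batch, time, hidden)
        logits, mu, log_sigma = self.head(h).chunk(3, dim=-1)
        return logits, mu, log_sigma

def mixture_nll(logits, mu, log_sigma, y):
    # Negative log-likelihood of scalar targets y under the mixture:
    # -log sum_k pi_k N(y | mu_k, sigma_k), via log-sum-exp for stability.
    log_pi = torch.log_softmax(logits, dim=-1)
    comp = torch.distributions.Normal(mu, log_sigma.exp())
    log_prob = comp.log_prob(y.unsqueeze(-1))     # broadcast over components
    return -torch.logsumexp(log_pi + log_prob, dim=-1).mean()

# Usage: fit a two-component mixture to a toy one-to-many sequence task.
model = MixtureDensityRNN(input_size=1, hidden_size=32, n_components=2)
x = torch.randn(8, 50, 1)                         # (batch, time, features)
y = torch.randn(8, 50)                            # scalar target per step
loss = mixture_nll(*model(x), y)
loss.backward()
```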