An Optimized Parallel Implementation of Non-Iteratively Trained Recurrent Neural Networks

Published: 2020-12-03 Issue: 1 Volume: 11 Page: 33-50
ISSN: 2083-2567
Container-title: Journal of Artificial Intelligence and Soft Computing Research
Language: en

Author:

Julia El Zini¹, Yara Rizk¹, Mariette Awad¹

Affiliation:

1. Department of Electrical and Computer Engineering, American University of Beirut

Abstract

Recurrent neural networks (RNNs) have been successfully applied to various sequential decision-making tasks, natural language processing applications, and time-series predictions. Such networks are usually trained through back-propagation through time (BPTT), which is prohibitively expensive, especially when the length of the time dependencies and the number of hidden neurons increase. To reduce the training time, extreme learning machines (ELMs) have recently been applied to RNN training, reaching a 99% speedup on some applications. Due to its non-iterative nature, ELM training, when parallelized, has the potential to reach higher speedups than BPTT. In this work, we present Opt-PR-ELM, an optimized parallel RNN training algorithm based on ELM that takes advantage of GPU shared memory and of parallel QR factorization algorithms to efficiently reach optimal solutions. The theoretical analysis of the proposed algorithm is presented for six RNN architectures, including LSTM and GRU, and its performance is empirically tested on ten time-series prediction applications. Opt-PR-ELM is shown to reach up to a 461-fold speedup over its sequential counterpart and to require up to 20x less time to train than parallel BPTT. Such high speedups over new-generation CPUs are crucial in real-time applications and IoT environments.
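To make the non-iterative idea in the abstract concrete, the following is a minimal sequential sketch of ELM-style training for a simple Elman RNN. It is not the authors' Opt-PR-ELM and involves no GPU code. It only illustrates the two steps the abstract names: hidden weights are drawn at random and left fixed, and the output weights are obtained in one least-squares solve via QR factorization. The function names (`hidden_states`, `elm_rnn_fit`), the tanh activation, the weight ranges, and the toy sine data are all assumptions made for illustration.

```python
import numpy as np

def hidden_states(X, W, U, b):
    """Run a simple Elman recurrence and return the final hidden state.

    X has shape (n_samples, n_steps, n_features).
    """
    h = np.zeros((X.shape[0], W.shape[1]))
    for t in range(X.shape[1]):
        # Input and recurrent weights are random and never updated.
        h = np.tanh(X[:, t, :] @ W + h @ U + b)
    return h

def elm_rnn_fit(X, y, n_hidden=8, seed=0):
    """Hypothetical helper: fit output weights in a single QR-based solve."""
    rng = np.random.default_rng(seed)
    n_features = X.shape[2]
    W = rng.uniform(-1.0, 1.0, (n_features, n_hidden))
    U = rng.uniform(-1.0, 1.0, (n_hidden, n_hidden)) / np.sqrt(n_hidden)
    b = rng.uniform(-1.0, 1.0, n_hidden)
    H = hidden_states(X, W, U, b)
    # Least squares through reduced QR: H = QR, then solve R beta = Q^T y.
    # A dedicated triangular solver would normally be used for this step.
    Q, R = np.linalg.qr(H)
    beta = np.linalg.solve(R, Q.T @ y)
    return W, U, b, beta

def elm_rnn_predict(X, W, U, b, beta):
    return hidden_states(X, W, U, b) @ beta

# Toy usage: one-step-ahead prediction on a noisy sine wave.
rng = np.random.default_rng(1)
series = np.sin(0.1 * np.arange(300)) + 0.05 * rng.standard_normal(300)
n_steps = 5
X = np.stack([series[i:i + n_steps] for i in range(len(series) - n_steps)])
X = X[:, :, None]            # shape (n_samples, n_steps, 1)
y = series[n_steps:]         # next value after each window
W, U, b, beta = elm_rnn_fit(X, y)
mse = float(np.mean((elm_rnn_predict(X, W, U, b, beta) - y) ** 2))
```

Training here involves no gradient steps and no passes over the data beyond the single forward recurrence, which is the property that makes the approach amenable to parallelization.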

Publisher

Walter de Gruyter GmbH

Subject

Artificial Intelligence, Computer Vision and Pattern Recognition, Hardware and Architecture, Modeling and Simulation, Information Systems

Link

https://www.sciendo.com/pdf/10.2478/jaiscr-2021-0003

Cited by 19 articles.

1. A Review on Large-Scale Data Processing with Parallel and Distributed Randomized Extreme Learning Machine Neural Networks;Mathematical and Computational Applications;2024-05-27

2. Diarec: Dynamic Intention-Aware Recommendation with Attention-Based Context-Aware Item Attributes Modeling;Journal of Artificial Intelligence and Soft Computing Research;2024-03-01

3. Transformers in Time-Series Analysis: A Tutorial;Circuits, Systems, and Signal Processing;2023-07-25

4. A New Approach to Statistical Iterative Reconstruction Algorithm for a CT Scanner with Flying Focal Spot Using a Rebinning Method;Artificial Intelligence and Soft Computing;2023

5. Employee Turnover Prediction From Email Communication Analysis;Artificial Intelligence and Soft Computing;2023