Moving Learning Machine towards Fast Real-Time Applications: A High-Speed FPGA-Based Implementation of the OS-ELM Training Algorithm

Published:2018-11-07 Issue:11 Volume:7 Page:308
ISSN:2079-9292
Container-title:Electronics
language:en
Short-container-title:Electronics

Author:

Frances-Villora Jose V., Rosado-Muñoz Alfredo, Bataller-Mompean Manuel, Barrios-Aviles Juan, Guerrero-Martinez Juan F.

Abstract

Several emerging online learning applications handle data streams in real time. The On-line Sequential Extreme Learning Machine (OS-ELM) has been successfully used in real-time condition prediction applications because of its good generalization performance at an extremely fast learning speed, but the number of trainings per second (training frequency) achieved in these continuous learning applications needs to be further improved. This paper proposes a performance-optimized implementation of the OS-ELM training algorithm for real-time applications. In this setting, the natural way of feeding the training of the neural network is one-by-one, i.e., training the network on each new incoming training input vector. Under this restriction, the computational needs are drastically reduced. An FPGA-based implementation of the tailored OS-ELM algorithm is used to analyze, in a parameterized way, the level of optimization achieved. We observed that the tailored algorithm drastically reduces the number of clock cycles consumed by the training execution, to approximately 1%. This performance enables high sequential training rates, such as a sequential training frequency of 14 kHz for a single-hidden-layer feedforward network (SLFN) with 40 hidden neurons, or 180 Hz for an SLFN with 500 hidden neurons. In practice, the proposed implementation computes the training almost 100 times faster, or more, than other implementations reported in the literature. Moreover, the number of clock cycles follows a quadratic complexity $O(\tilde{N}^2)$, where $\tilde{N}$ is the number of hidden neurons, and is only weakly influenced by the number of input neurons. However, the implementation shows a pronounced sensitivity to data type precision even on small-size problems, which forces the use of double-precision floating-point data types to avoid finite-precision arithmetic effects. In addition, distributed memory was found to be the limiting resource; thus, current FPGA devices can support OS-ELM-based on-chip learning with up to 500 hidden neurons. In conclusion, the proposed hardware implementation of the OS-ELM offers great possibilities for on-chip learning in portable systems and real-time applications where frequent and fast training is required.
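To illustrate why one-by-one training reduces the computational load, the following is a minimal software sketch of the standard OS-ELM recursive update from the literature, specialized to a chunk size of one. It is not the paper's FPGA design. With a single sample, the term that normally requires a matrix inversion collapses to a scalar, so each step costs one matrix-vector product and one rank-1 update, both quadratic in the number of hidden neurons. The function names (`init_state`, `hidden_row`, `oselm_step`) are hypothetical, and the regularized start `P = I / lam`, `beta = 0` is an assumption used here in place of the usual initial batch phase.

```python
import math
import random


def init_state(n_hidden, n_out, lam=1e-2):
    """Assumed regularized start: P = I / lam, beta = 0 (replaces the batch phase)."""
    P = [[(1.0 / lam if i == j else 0.0) for j in range(n_hidden)]
         for i in range(n_hidden)]
    beta = [[0.0] * n_out for _ in range(n_hidden)]
    return P, beta


def hidden_row(x, W, b):
    """Sigmoid hidden-layer output for one input vector x (random fixed W, b)."""
    return [1.0 / (1.0 + math.exp(-(sum(wi * xi for wi, xi in zip(w, x)) + bj)))
            for w, bj in zip(W, b)]


def oselm_step(P, beta, h, t):
    """One OS-ELM update for a single sample (h, t); P and beta change in place."""
    n, m = len(h), len(t)
    # Ph = P h^T: one matrix-vector product, O(N^2)
    Ph = [sum(P[i][j] * h[j] for j in range(n)) for i in range(n)]
    # For one sample, (I + h P h^T) is a scalar: the inversion becomes a division
    denom = 1.0 + sum(h[i] * Ph[i] for i in range(n))
    # Rank-1 update P <- P - (Ph Ph^T) / denom, O(N^2); relies on P being symmetric
    for i in range(n):
        for j in range(n):
            P[i][j] -= Ph[i] * Ph[j] / denom
    # Prediction error with the old weights: e = t - h beta
    err = [t[k] - sum(h[i] * beta[i][k] for i in range(n)) for k in range(m)]
    # beta <- beta + P_new h^T e, where P_new h^T equals Ph / denom
    for i in range(n):
        gain = Ph[i] / denom
        for k in range(m):
            beta[i][k] += gain * err[k]


if __name__ == "__main__":
    random.seed(0)
    n_in, n_hidden = 3, 5
    W = [[random.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
    b = [random.uniform(-1, 1) for _ in range(n_hidden)]
    P, beta = init_state(n_hidden, 1)
    for _ in range(200):  # stream of samples, trained one at a time
        x = [random.uniform(-1, 1) for _ in range(n_in)]
        oselm_step(P, beta, hidden_row(x, W, b), [math.sin(x[0]) + x[1] * x[2]])
    print(beta)
```

The explicit double loop over `P` mirrors the quadratic clock-cycle growth stated in the abstract. Because every step subtracts a rank-1 term from `P`, rounding errors accumulate across updates, which is consistent with the reported sensitivity to data type precision.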

Publisher

MDPI AG

Subject

Electrical and Electronic Engineering,Computer Networks and Communications,Hardware and Architecture,Signal Processing,Control and Systems Engineering

Link

http://www.mdpi.com/2079-9292/7/11/308/pdf

Cited by 15 articles.

1. Hardware Implementation of MRO-ELM for Online Sequential Learning on FPGA;Communications in Computer and Information Science;2023-12-23

2. Efficient Compressed Ratio Estimation Using Online Sequential Learning for Edge Computing;2023 IEEE 34th Annual International Symposium on Personal, Indoor and Mobile Radio Communications (PIMRC);2023-09-05

3. Selecting Language Models Features VIA Software-Hardware Co-Design;ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP);2023-06-04

4. Emotion Recognition on Edge Devices: Training and Deployment;Sensors;2021-06-30

5. Semi-Supervised Extreme Learning Machine Channel Estimator and Equalizer for Vehicle to Vehicle Communications;Electronics;2021-04-19