Affiliation:
1. Faculty of Computer Science and Information Technology, West Pomeranian University of Technology in Szczecin, Żołnierska 49, 71-210 Szczecin, Poland
Abstract
The paper introduces a range of efficient algorithmic solutions for implementing the fundamental filtering operation in convolutional layers of convolutional neural networks on fully parallel hardware. Specifically, these operations involve computing M inner products between neighbouring vectors generated by a sliding time window from the input data stream and an M-tap finite impulse response filter. By leveraging the factorisation of the Hankel matrix, we have successfully reduced the multiplicative complexity of the matrix-vector product calculation. This approach has been applied to develop fully parallel and resource-efficient algorithms for M values of 3, 5, 7, and 9. The fully parallel hardware implementation of our proposed algorithms achieves approximately a 30% reduction in embedded multipliers compared to the naive calculation methods.
Subject
Fluid Flow and Transfer Processes,Computer Science Applications,Process Chemistry and Technology,General Engineering,Instrumentation,General Materials Science