Abstract
We address the efficient realization of matrix multiplication (gemm), with application in the convolution operator for machine learning, for the RISC-V core present in the GreenWaves GAP8 processor. Our approach leverages BLIS (Basic Linear Algebra Instantiation Software) to develop an implementation that (1) re-organizes the gemm algorithm, adapting its micro-kernel to exploit the hardware-supported dot product kernel in the GAP8; (2) explicitly orchestrates the data transfers across the hierarchy of scratchpad memories via DMA (direct memory access); and (3) operates with integer arithmetic.
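The idea of building gemm around a dot-product primitive with integer arithmetic can be illustrated with a minimal sketch. This is not the authors' implementation: the names `dot4` and `gemm_int` are hypothetical, `dot4` is only a software stand-in for a hardware 4-way dot product, the packing of B by columns is an assumed layout, and the DMA transfers between scratchpad memories are not modeled at all.

```python
def dot4(a, b):
    """Software stand-in (assumption) for a hardware 4-way integer dot product."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2] + a[3] * b[3]


def gemm_int(A, B):
    """Compute C = A * B for integer matrices given as lists of rows.

    The inner (k) dimension must be a multiple of 4 so that every
    accumulation step maps onto one dot4 call.
    """
    m, k, n = len(A), len(B), len(B[0])
    if k % 4 != 0:
        raise ValueError("inner dimension must be a multiple of 4")

    # Pack B by columns so each dot4 call reads four contiguous values,
    # mirroring the packing step of a BLIS-style gemm (assumed layout).
    B_packed = [[B[p][j] for p in range(k)] for j in range(n)]

    C = [[0] * n for _ in range(m)]
    for i in range(m):
        for j in range(n):
            acc = 0  # integer accumulator, wider than the inputs on real hardware
            for p in range(0, k, 4):
                acc += dot4(A[i][p:p + 4], B_packed[j][p:p + 4])
            C[i][j] = acc
    return C


def gemm_reference(A, B):
    """Plain triple-loop gemm used only to check the sketch."""
    m, k, n = len(A), len(B), len(B[0])
    return [[sum(A[i][p] * B[p][j] for p in range(k)) for j in range(n)]
            for i in range(m)]
```

Calling `gemm_int(A, B)` on small integer matrices and comparing against `gemm_reference(A, B)` is a quick way to check that the dot-product formulation computes the same product as the textbook loop.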
Funder
Ministerio de Ciencia, Innovación y Universidades
Generalitat Valenciana
Publisher
Springer Science and Business Media LLC
Subject
Hardware and Architecture, Information Systems, Theoretical Computer Science, Software
Cited by
7 articles.