Tensor Processing Primitives: A Programming Abstraction for Efficiency and Portability in Deep Learning and HPC Workloads

Published: 2022-04-18 Volume: 8
ISSN: 2297-4687
Container-title: Frontiers in Applied Mathematics and Statistics
Short-container-title: Front. Appl. Math. Stat.

Author:

Georganas Evangelos, Kalamkar Dhiraj, Avancha Sasikanth, Adelman Menachem, Aggarwal Deepti, Anderson Cristina, Breuer Alexander, Bruestle Jeremy, Chaudhary Narendra, Kundu Abhisek, Kutnick Denise, Laub Frank, Md Vasimuddin, Misra Sanchit, Mohanty Ramanarayan, Pabst Hans, Retford Brian, Ziv Barukh, Heinecke Alexander

Abstract

During the past decade, novel Deep Learning (DL) algorithms, workloads and hardware have been developed to tackle a wide range of problems. Despite the advances in workload and hardware ecosystems, the programming methodology of DL systems is stagnant. DL workloads either leverage highly optimized, yet platform-specific and inflexible kernels from DL libraries or, in the case of novel operators, rely on reference implementations built via DL framework primitives with underwhelming performance. This work introduces the Tensor Processing Primitives (TPP), a programming abstraction striving for efficient, portable implementation of DL workloads with high productivity. TPPs define a compact, yet versatile set of 2D-tensor operators [or a virtual Tensor Instruction Set Architecture (ISA)], which can subsequently be used as building blocks to construct complex operators on high-dimensional tensors. The TPP specification is platform-agnostic, so code expressed via TPPs is portable, whereas the TPP implementation is highly optimized and platform-specific. We demonstrate the efficacy and viability of our approach using standalone kernels and end-to-end DL and High Performance Computing (HPC) workloads expressed entirely via TPPs that outperform state-of-the-art implementations on multiple platforms.
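The building-block idea in the abstract can be illustrated with a minimal sketch. It is not the TPP API: the names `gemm_2d`, `relu_2d` and `blocked_linear_relu` are hypothetical, and the plain-Python loops stand in for what the abstract describes as a highly optimized, platform-specific implementation. The sketch only shows how an operator on a blocked 4D tensor might be composed from two operators that each see a single 2D tile.

```python
# Illustrative sketch only: hypothetical names, not the TPP interface.

def gemm_2d(a, b, c):
    """2D primitive: accumulate the product a @ b into c (lists of rows)."""
    for i in range(len(a)):
        for j in range(len(b[0])):
            acc = c[i][j]
            for k in range(len(b)):
                acc += a[i][k] * b[k][j]
            c[i][j] = acc


def relu_2d(x):
    """2D primitive: elementwise max(x, 0), applied in place."""
    for row in x:
        for j in range(len(row)):
            row[j] = max(row[j], 0.0)


def blocked_linear_relu(a_blocks, b_blocks):
    """Operator on blocked 4D tensors, built only from the 2D primitives.

    a_blocks[im][ik] and b_blocks[ik][jn] are grids of 2D tiles.
    Returns a grid of output tiles holding relu(A @ B).
    """
    mb, kb, nb = len(a_blocks), len(b_blocks), len(b_blocks[0])
    bm = len(a_blocks[0][0])      # rows per output tile
    bn = len(b_blocks[0][0][0])   # columns per output tile
    out = [[[[0.0] * bn for _ in range(bm)] for _ in range(nb)]
           for _ in range(mb)]
    for im in range(mb):
        for jn in range(nb):
            tile = out[im][jn]
            # Reduce over the blocked K dimension with the 2D GEMM primitive.
            for ik in range(kb):
                gemm_2d(a_blocks[im][ik], b_blocks[ik][jn], tile)
            # Apply the activation while the tile is still being worked on.
            relu_2d(tile)
    return out
```

In this sketch the outer loops know nothing about how a tile is multiplied or activated, so in principle only the two primitives would need a platform-specific implementation; that separation is the property the abstract attributes to TPPs.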

Publisher

Frontiers Media SA

Subject

Applied Mathematics, Statistics and Probability

References: 65 articles.

1. ImageNet classification with deep convolutional neural networks; Krizhevsky A, Sutskever I, Hinton GE; in Pereira F, Burges CJC, Bottou L, Weinberger KQ (eds.), Advances in Neural Information Processing Systems, 2012

2. Going deeper with convolutions; Szegedy, 2015

3. Very deep convolutional networks for large-scale image recognition; Simonyan; arXiv preprint, 2014

4. Feature learning in deep neural networks-studies on speech recognition tasks; Yu; arXiv preprint, 2013

5. Google's neural machine translation system: bridging the gap between human and machine translation; Wu; arXiv preprint, 2016