Affiliation:
1. Computer Engineering Department, University of Murcia, Murcia, Spain
2. Department of Information Engineering and Mathematics, University of Siena, Siena, Italy
Abstract
This work describes the use of the PHAST Library, a hardware-agnostic programming library, in a real-world application, the Caffe framework. The original implementation of Caffe consists of two different versions of the source code: one to run on CPU platforms and another to run on GPUs. With PHAST, we aim to develop a single-source implementation capable of running efficiently on both CPU and GPU. In this paper, we start by analyzing the performance of a basic implementation of Caffe using PHAST. Then, we detail possible performance improvements. We find that the overall performance is dominated by a few ‘heavy’ layers. In refining the inefficient parts of this version, we follow two different approaches: improvements to the Caffe source code and improvements to the PHAST Library itself, which ultimately translate into improved performance in the PHAST version of Caffe. We demonstrate that our PHAST implementation achieves performance portability on CPUs and GPUs. With a single source, the PHAST version of Caffe provides the same or even better performance than the original version of Caffe built from two different codebases. For the MNIST database, the PHAST implementation takes an amount of time equivalent to that of the native code on both CPU and GPU. Furthermore, PHAST achieves speedups of 51% and 49% with the CIFAR-10 database against native code on CPU and GPU, respectively. These results provide a new horizon for software development in the upcoming heterogeneous computing era.
Funder
European Regional Development Fund
Subject
Hardware and Architecture, Theoretical Computer Science, Software