Affiliation:
1. AMD Research, Advanced Micro Devices, Inc.
2. Computer Sciences Department, University of Wisconsin-Madison
Abstract
With the end of Dennard scaling, architects have increasingly turned to special-purpose hardware accelerators to improve the performance and energy efficiency of some applications. Unfortunately, accelerators do not always live up to expectations and may underperform in some situations. Understanding the factors that affect the performance of an accelerator is crucial for both architects and programmers early in the design stage. Detailed models can be highly accurate, but often require low-level details that are not available until late in the design cycle. In contrast, simple analytical models can provide useful insights by abstracting away low-level system details.
In this paper, we propose LogCA, a high-level performance model for hardware accelerators. LogCA helps both programmers and architects identify performance bounds and design bottlenecks early in the design cycle, and provides insight into which optimizations may alleviate these bottlenecks. We validate our model across a variety of kernels, ranging from sub-linear to super-linear complexities, on both on-chip and off-chip accelerators. We also describe the utility of LogCA using two retrospective case studies. First, we discuss the evolution of interface design in SUN/Oracle's encryption accelerators. Second, we discuss the evolution of memory interface design in three different GPU architectures. In both cases, we show that the design optimizations adopted for these machines are similar to the optimizations LogCA suggests. We argue that architects and programmers can use insights from these retrospective studies to improve future designs.
Funder
National Science Foundation
Publisher
Association for Computing Machinery (ACM)