Affiliation:
1. Department of Computer Engineering and Information Technology, Razi University, Kermanshah, Iran
Abstract
Modern GPUs can execute multiple kernels concurrently to keep the hardware resources busy and to boost overall performance. This approach is called simultaneous multiple kernel execution (MKE). MKE is a promising approach for improving GPU hardware utilization. Although modern GPUs allow MKE, the effects of different MKE scenarios have not been adequately studied. Since cache memories have a significant effect on overall GPU performance, the effects of MKE on cache performance should be investigated properly. The present study proposes a framework, called RDMKE (short for Reuse Distance-based profiling in MKEs), that provides a method for analyzing GPU cache memory performance in MKE scenarios. The raw memory access information of a kernel is first extracted, and RDMKE then imposes an ordering on the memory accesses so that the trace represents a given MKE scenario. Afterward, RDMKE employs reuse distance analysis (RDA) to generate cache-related performance metrics, including hit ratios, transaction counts, and cache set and Miss Status Holding Register reservation fails. In addition, RDMKE provides the user with the RD profiles as a useful locality metric. The simulation results of single kernel executions show a fair correlation between the results generated by RDMKE and GPU performance counters. Further, the simulation results of 28 two-kernel executions indicate that RDMKE can properly capture the nonlinear cache behaviors in MKE scenarios.
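To make the idea of reuse distance analysis on a reordered trace concrete, the following is a minimal sketch and not RDMKE's actual implementation. It assumes a fully associative LRU cache, treats each trace entry as a cache-line address, and uses a simple round-robin merge as a stand-in for an MKE ordering; the function names `interleave`, `reuse_distances`, and `hit_ratio` are hypothetical.

```python
from itertools import zip_longest


def interleave(trace_a, trace_b):
    """Round-robin merge of two per-kernel traces (an assumed MKE ordering)."""
    merged = []
    for a, b in zip_longest(trace_a, trace_b):
        if a is not None:
            merged.append(a)
        if b is not None:
            merged.append(b)
    return merged


def reuse_distances(trace):
    """Return one reuse distance per access.

    The reuse distance is the number of distinct lines touched since the
    previous access to the same line; None marks a first (cold) access.
    """
    stack = []  # LRU stack, most recently used line at the end
    dists = []
    for line in trace:
        if line in stack:
            idx = stack.index(line)
            dists.append(len(stack) - 1 - idx)
            stack.pop(idx)
        else:
            dists.append(None)
        stack.append(line)
    return dists


def hit_ratio(dists, capacity):
    """Hit ratio of a fully associative LRU cache holding `capacity` lines."""
    if not dists:
        return 0.0
    hits = sum(1 for d in dists if d is not None and d < capacity)
    return hits / len(dists)


# Two toy kernels, each reusing two lines of its own.
kernel_a = [0, 1, 0, 1]
kernel_b = [8, 9, 8, 9]
merged = interleave(kernel_a, kernel_b)

alone = hit_ratio(reuse_distances(kernel_a), capacity=2)
shared = hit_ratio(reuse_distances(merged), capacity=2)
```

In this toy case each kernel alone hits on half of its accesses with a two-line cache, while the interleaved trace hits on none, because the other kernel's lines stretch every reuse distance from 1 to 3. This is one simple way in which co-running kernels can change cache behavior nonlinearly.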
Publisher
World Scientific Pub Co Pte Lt
Subject
Electrical and Electronic Engineering, Hardware and Architecture
Cited by
1 article.