Author:
Fang Juan, Zhao Li’ang, Cai Min, Yang Huijing
Abstract
Normally, threads in a warp do not severely interfere with each other. However, the scheduler must wait until all threads in a warp complete before scheduling the next warp, which results in memory divergence. The crux of the problem is scheduling the warps in a more reasonable order. We therefore propose a new warp scheduling strategy called WSMP, which is based on a multi-level feedback queue (MFQ) and perceptron-based prefetch filtering (PPF). All warps are sorted beforehand according to their latency tolerance, and each is pushed into a corresponding queue in the MFQ. We also remold PPF to enhance the modified underlying prefetcher, which lets us strike a balance between cache hit rate and prefetch coverage. We verify the feasibility of WSMP using GPGPU-Sim with dedicated GPGPU workloads. The results show that, compared to the baseline, WSMP improves IPC by 26.45% and reduces the L2 cache miss rate by 9.54% on average.
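The queueing step described in the abstract can be illustrated with a minimal sketch. This is not the WSMP implementation: the `Warp` class, the `latency_tolerance` score in [0, 1], the threshold values, and the choice to serve low-tolerance warps first are all assumptions made for illustration, and the feedback step (moving warps between queues at run time) and the PPF component are omitted.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Warp:
    warp_id: int
    latency_tolerance: float  # hypothetical score in [0, 1], not from the paper


class MultiLevelQueue:
    """Queues ordered by priority; level 0 is served first (an assumption)."""

    def __init__(self, thresholds):
        # thresholds are ascending upper bounds; one extra queue catches the rest
        self.thresholds = list(thresholds)
        self.queues = [deque() for _ in range(len(self.thresholds) + 1)]

    def level_for(self, warp):
        # Pick the first level whose upper bound exceeds the warp's score.
        for level, bound in enumerate(self.thresholds):
            if warp.latency_tolerance < bound:
                return level
        return len(self.thresholds)

    def push(self, warp):
        self.queues[self.level_for(warp)].append(warp)

    def pop_next(self):
        # Serve the highest-priority non-empty queue in FIFO order.
        for queue in self.queues:
            if queue:
                return queue.popleft()
        return None


def schedule_order(warps, thresholds=(0.33, 0.66)):
    """Sort warps by latency tolerance, enqueue them, and drain the queues."""
    mlq = MultiLevelQueue(thresholds)
    for warp in sorted(warps, key=lambda w: w.latency_tolerance):
        mlq.push(warp)
    order = []
    while (warp := mlq.pop_next()) is not None:
        order.append(warp.warp_id)
    return order


warps = [Warp(0, 0.9), Warp(1, 0.1), Warp(2, 0.5), Warp(3, 0.2)]
print(schedule_order(warps))
```

With the assumed thresholds, warps 1 and 3 land in the first queue, warp 2 in the second, and warp 0 in the third, so they are issued in that order.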
Funder
National Natural Science Foundation of China
Beijing Municipal Natural Science Foundation
Publisher
Springer Science and Business Media LLC
Subject
Hardware and Architecture, Information Systems, Theoretical Computer Science, Software