Prophet

Published:2017-05-12 Issue:4 Volume:52 Page:17-32
ISSN:0362-1340
Container-title:ACM SIGPLAN Notices
language:en
Short-container-title:SIGPLAN Not.

Author:

Chen Quan¹,Yang Hailong²,Guo Minyi¹,Kannan Ram Srivatsa³,Mars Jason³,Tang Lingjia³

Affiliation:

1. Shanghai Jiao Tong University, Shanghai, China

2. Beihang University, Beijing, China

3. University of Michigan - Ann Arbor, Ann Arbor, USA

Abstract

Guaranteeing Quality-of-Service (QoS) of latency-sensitive applications while improving server utilization through application co-location is important yet challenging in modern datacenters. The key challenge is that when applications are co-located on a server, performance interference due to resource contention can be detrimental to the application QoS. Although prior work has proposed techniques to identify "safe" co-locations where application QoS is satisfied by predicting the performance interference on multicores, no such prediction technique exists for accelerators such as GPUs. In this work, we present Prophet, an approach to precisely predict the performance degradation of latency-sensitive applications on accelerators due to application co-location. We analyzed the performance interference on accelerators through a real system investigation and found that unlike on multicores, where the key contentious resources are shared caches and main memory bandwidth, the key contentious resources on accelerators are instead processing elements, accelerator memory bandwidth and PCIe bandwidth. Based on this observation, we designed interference models that enable precise prediction of processing element, accelerator memory bandwidth and PCIe bandwidth contention on real hardware. By using a novel technique to forecast solo-run execution traces of the co-located applications using interference models, Prophet can accurately predict the performance degradation of latency-sensitive applications on non-preemptive accelerators. Using Prophet, we can identify "safe" co-locations on accelerators to improve utilization without violating the QoS target. Our evaluation shows that Prophet can predict the performance degradation with an average prediction error of 5.47% on real systems. Meanwhile, based on the prediction, Prophet achieves accelerator utilization improvements of 49.9% on average while maintaining the QoS target of latency-sensitive applications.

Funder

National Natural Science Foundation of China

National Science Foundation

National Basic Research 973 Program of China

Publisher

Association for Computing Machinery (ACM)

Subject

Computer Graphics and Computer-Aided Design,Software

Link

https://dl.acm.org/doi/pdf/10.1145/3093336.3037700

Reference: 51 articles.

1. Nvidia Multi-Process Service. https://docs.nvidia.com/deploy/pdf/CUDA_Multi_Process_Service_Overview.pdf.

2. Profiler User's Guide. http://docs.nvidia.com/cuda/profiler-users-guide.

3. The case for GPGPU spatial multitasking

4. QoS-aware dynamic resource allocation for spatial-multitasking GPUs

Cited by 2 articles.

1. ODIN: Overcoming Dynamic Interference in iNference Pipelines;Euro-Par 2023: Parallel Processing;2023

2. WSMeter;Proceedings of the Twenty-Third International Conference on Architectural Support for Programming Languages and Operating Systems;2018-03-19