Affiliation:
1. University of North Carolina at Chapel Hill
2. University of Illinois, Urbana-Champaign
Abstract
Data center workloads are composed of multiresource jobs requiring a variety of computational resources including CPU cores, memory, disk space, and hardware accelerators. Modern servers can run multiple jobs in parallel, but a set of jobs can only run in parallel if the server has sufficient resources to satisfy the demands of each job. It is generally hard to find sets of jobs that perfectly utilize all server resources, and choosing the wrong set of jobs can lead to low resource utilization. This raises the question of how to allocate resources across a stream of arriving multiresource jobs to minimize the mean response time across jobs, that is, the mean time from when a job arrives at the system until it completes.
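To make the two notions in the abstract concrete, the following is a minimal illustrative sketch, not the authors' method. It assumes each job's demand and the server's capacity are given as per-resource totals. The names `fits`, `mean_response_time`, and the resource keys `cpu` and `mem_gb` are hypothetical and chosen only for this example.

```python
from typing import Dict, List

# A resource vector maps a resource name to an amount (hypothetical representation).
Resources = Dict[str, float]


def fits(jobs: List[Resources], capacity: Resources) -> bool:
    """Return True if the jobs can run in parallel on one server.

    A set of jobs is feasible when, for every resource, the summed demand
    does not exceed the server's capacity for that resource.
    """
    total: Resources = {}
    for demand in jobs:
        for name, amount in demand.items():
            total[name] = total.get(name, 0.0) + amount
    # A resource the server does not list is treated as having zero capacity.
    return all(used <= capacity.get(name, 0.0) for name, used in total.items())


def mean_response_time(arrivals: List[float], completions: List[float]) -> float:
    """Average of (completion time - arrival time) over all jobs."""
    return sum(c - a for a, c in zip(arrivals, completions)) / len(arrivals)


# Illustrative numbers only; they do not come from the paper.
server = {"cpu": 8, "mem_gb": 32}
job_a = {"cpu": 4, "mem_gb": 8}
job_b = {"cpu": 4, "mem_gb": 30}
job_c = {"cpu": 2, "mem_gb": 16}

print(fits([job_a, job_b], server))  # memory demand 38 exceeds 32
print(fits([job_a, job_c], server))  # both resources within capacity
print(mean_response_time([0, 1, 2], [3, 5, 4]))
```

In this toy instance, jobs A and B exactly fill the CPU cores but cannot run together because their combined memory demand exceeds capacity, while A and C fit but leave resources idle. This is the kind of packing trade-off the abstract describes.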
Publisher
Association for Computing Machinery (ACM)