Affiliation:
1. Department of Electrical Engineering, Columbia University, New York, New York 10027
Abstract
Jobs in modern parallel-computing frameworks, such as Hadoop and Spark, are subject to several constraints. In these frameworks, the data are typically distributed across a cluster of machines and are processed in multiple stages. Consequently, tasks that belong to the same stage (job) have a collective completion time that is determined by the slowest task in the collection. Furthermore, a task's processing time is machine dependent, and each machine can process multiple tasks at a time, subject to its capacity. In "Scheduling Parallel-Task Jobs Subject to Packing and Placement Constraints," Mehrnoosh Shafiee and Javad Ghaderi provide multiple approximation algorithms with theoretical guarantees for this problem under both preemptive and nonpreemptive scenarios. Numerical results on a real traffic trace demonstrate that the algorithms yield significant gains over prior approaches.
Publisher
Institute for Operations Research and the Management Sciences (INFORMS)
Subject
Management Science and Operations Research; Computer Science Applications