1. Gandiva: Introspective cluster scheduling for deep learning;Xiao,2018
2. Analysis of large-scale multi-tenant GPU clusters for DNN training workloads;Jeon,2019
3. Tiresias: A GPU cluster manager for distributed deep learning;Gu,2019
4. HiveD: Sharing a GPU cluster for deep learning with guarantees;Zhao,2020
5. AntMan: Dynamic scaling on GPU clusters for deep learning;Xiao,2020