Abstract
In modern HPC systems with deep hierarchical architectures, large-scale applications often struggle to efficiently utilize the abundant cores due to the saturation of resources such as memory. Co-allocating multiple applications to share compute nodes can mitigate these issues and increase system throughput. However, co-allocation may harm the performance of individual applications due to resource contention. Past research suggests that topology-aware mappings can improve the performance of parallel applications that do not share resources. In this work, we implement application-oblivious, topology-aware process-to-core mappings via different core enumerations that support the co-allocation of parallel applications. We show that these mappings have a significant impact on the available memory bandwidth. We explore how these process-to-core mappings can affect the individual application duration as well as the makespan of job schedules when they are combined with co-allocation. Our main objective is to assess whether co-allocation with a topology-aware mapping can be a viable alternative to the exclusive node allocation policies that are currently common in HPC clusters.
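To make the idea of a core enumeration concrete, the following minimal sketch shows how permuting the order in which hierarchy levels are traversed yields different process-to-core mappings. It is an illustration only, not the implementation evaluated in this work. The node shape (2 sockets, 2 NUMA domains per socket, 2 cores per domain), the function name `enumerate_cores`, and the compact physical core numbering are all assumptions made for the example.

```python
from itertools import product


def enumerate_cores(shape, order):
    """Return physical core IDs in the order in which ranks would be placed.

    shape: sizes of the hierarchy levels, outermost first,
           e.g. (sockets, numa_per_socket, cores_per_numa)  [hypothetical node]
    order: permutation of level indices; order[0] varies fastest.
    """
    # Stride of each level in an assumed compact physical numbering.
    strides = []
    stride = 1
    for size in reversed(shape):
        strides.append(stride)
        stride *= size
    strides.reverse()

    # itertools.product varies its last argument fastest, so the levels
    # are passed from slowest-varying to fastest-varying.
    levels = list(reversed(order))
    cores = []
    for idx in product(*(range(shape[level]) for level in levels)):
        coord = dict(zip(levels, idx))
        cores.append(sum(coord[level] * strides[level] for level in coord))
    return cores


shape = (2, 2, 2)  # assumed: 2 sockets x 2 NUMA domains x 2 cores

# Cores vary fastest: ranks fill one NUMA domain, then the next.
compact = enumerate_cores(shape, order=(2, 1, 0))

# Sockets vary fastest: consecutive ranks alternate between sockets.
scatter = enumerate_cores(shape, order=(0, 1, 2))

# Two co-allocated jobs each take one half of the enumeration.
job_a, job_b = scatter[:4], scatter[4:]
```

With the compact enumeration, a job that receives the first four cores occupies a single socket. With the scattered enumeration, the same job is spread over every NUMA domain, so it can draw on more memory controllers while sharing each domain with the co-allocated job.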
Publisher
Springer Nature Switzerland