Ant Mill: an adversarial traffic pattern for low-diameter direct networks
Published:2024-05-10
Issue:12
Volume:80
Page:18062-18080
ISSN:0920-8542
Container-title:The Journal of Supercomputing
language:en
Short-container-title:J Supercomput
Author:
Camarero Cristóbal, Martínez Carmen, Beivide Ramón
Abstract
Since today’s HPC and data center systems can comprise hundreds of thousands of servers or more, it is crucial to equip them with a network that provides high performance. New topologies proposed to achieve such performance need to be evaluated under different traffic conditions that closely replicate real-world scenarios. While most optimizations should be guided by common traffic patterns, it is essential to ensure that no pathological traffic pattern can compromise the entire system. Determining synthetic adversarial traffic patterns for a network typically relies on a thorough understanding of its topology and routing. In this paper, we address the problem of identifying a generic adversarial traffic pattern for low-diameter direct interconnection networks. We first focus on Random Regular Graphs (RRGs), which represent a typical case for these networks and which have themselves been proposed as topologies for interconnection networks due to their superior scalability and expandability, among other advantages. We introduce Ant Mill, an adversarial traffic pattern for RRGs when using routes of minimal length. We then demonstrate that the Ant Mill traffic pattern is also adversarial in other low-diameter direct interconnection networks such as Slimfly, Dragonfly, and Projective networks. Ant Mill is thoroughly motivated and evaluated, enabling future studies of low-diameter direct interconnection networks to leverage its findings.
Funder
Universidad de Cantabria
Publisher
Springer Science and Business Media LLC