Affiliation:
1. University of Cantabria, Santander, Spain
Abstract
Current High-Performance Computing (HPC) and data center networks rely on large-radix routers. Hamming graphs (Cartesian products of complete graphs) and dragonflies (two-level direct networks with nodes organized in groups) are some direct topologies proposed for such networks. The original definition of the dragonfly topology is very loose, with several degrees of freedom, such as the inter- and intragroup topology, the specific global connectivity, and the number of parallel links between groups (or trunking level). This work provides a comprehensive analysis of the topological properties of the dragonfly network, providing balancing conditions for network dimensioning, as well as introducing and classifying several alternatives for the global connectivity and trunking level. From a topological study of the network, it is noted that a Hamming graph can be seen as a canonical dragonfly topology with a high level of trunking. Based on this observation and by carefully selecting the global connectivity, the Dimension Order Routing (DOR) mechanism safely used in Hamming graphs is adapted to dragonfly networks with trunking. The resulting routing algorithms approximate the performance of minimal, nonminimal, and adaptive routings typically used in dragonflies but without requiring virtual channels to avoid packet deadlock, thus allowing for lower-cost router implementations. This is obtained by properly selecting the link to route between groups based on a graph coloring of network routers. Evaluations show that the proposed mechanisms are competitive with traditional solutions when using the same number of virtual channels and enable simpler, lower-cost implementations. Finally, multilevel dragonflies are discussed, considering how the proposed mechanisms could be adapted to them.
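As a rough illustration of the baseline the abstract builds on, the sketch below shows Dimension Order Routing on a plain Hamming graph, not the paper's adaptation to dragonflies with trunking. Routers are modeled as coordinate tuples, two routers are linked when they differ in exactly one coordinate, and a packet corrects its coordinates in increasing dimension order. The function names (`hamming_nodes`, `are_adjacent`, `dor_path`) and the example sizes are assumptions made for this sketch and do not come from the paper.

```python
from itertools import product


def hamming_nodes(sizes):
    """List every router of the Hamming graph K_s0 x K_s1 x ... as a coordinate tuple."""
    return list(product(*(range(s) for s in sizes)))


def are_adjacent(a, b):
    """Two routers share a link when they differ in exactly one coordinate."""
    return sum(x != y for x, y in zip(a, b)) == 1


def dor_path(src, dst):
    """Dimension Order Routing: fix mismatched coordinates in increasing dimension order.

    Each dimension is a complete graph, so one hop per mismatched
    coordinate is enough to reach the destination.
    """
    path = [tuple(src)]
    current = list(src)
    for dim, target in enumerate(dst):
        if current[dim] != target:
            current[dim] = target
            path.append(tuple(current))
    return path


# Hypothetical example in the 4 x 3 x 2 Hamming graph.
path = dor_path((0, 2, 1), (3, 2, 0))
print(path)  # [(0, 2, 1), (3, 2, 1), (3, 2, 0)]
```

Because every packet traverses dimensions in the same fixed order, no cyclic dependency between links of different dimensions can form, which is why DOR is deadlock-free on Hamming graphs without virtual channels.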
Funder
European HiPEAC Network of Excellence and the JSA no. 2013-119
Spanish FPU
ERC-321253 (RoMoL)
Spanish Science and Technology Commission (CICYT) under contracts TIN2010-21291-C02-02 and TIN2013-46957-C2-2-P
IBM/BSC Technology Center for Supercomputing agreement
European Union FP7 under Agreements ICT-288777 (Mont-Blanc)
Publisher
Association for Computing Machinery (ACM)
Subject
Hardware and Architecture, Information Systems, Software
Cited by
33 articles.
1. Graph Constructions Derived from Interconnection Networks;Springer Proceedings in Mathematics & Statistics;2024
2. QVal: A Novel Routing Algorithm for Dragonfly Networks;2023 IEEE International Conference on High Performance Computing & Communications, Data Science & Systems, Smart City & Dependability in Sensor, Cloud & Big Data Systems & Application (HPCC/DSS/SmartCity/DependSys);2023-12-17
3. Analysing Mechanisms for Virtual Channel Management in Low-Diameter Networks;2023 IEEE 35th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD);2023-10-17
4. Efficient implementation of multi-level Dragonfly networks with Hamming graph for future optical networks;Journal of Optics;2023-05-30
5. Pancyclic And Hamiltonian Properties Of Dragonfly Networks;The Computer Journal;2023-05-14