Affiliation:
1. Boston University, Boston, MA
Abstract
FPGA-centric clouds and clusters provide direct and programmable interconnects (DPI), with obvious benefits for communication latency and bandwidth. One rarely studied aspect of DPI is that they facilitate application-aware routing: if communication patterns are static and known a priori, as is usually the case, then judicious routing can reduce congestion, latency, and the hardware required. In this study we explore applying offline (static) routing to collective operations, in particular multicast and reduction. An entirely new communication infrastructure is proposed and implemented, including a switch design and a routing algorithm. A substantial improvement in performance is obtained, especially for multicast. We believe that this is one of the few general offline/static routing solutions for real HPC clusters, and for FPGA-centric clusters in particular.
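To illustrate the general idea of application-aware offline routing mentioned in the abstract, the following is a minimal sketch and not the routing algorithm proposed in the paper. It assumes a 2D mesh topology, considers only two candidate routes per flow (X-then-Y and Y-then-X), and greedily picks the one that keeps the most heavily loaded link as lightly loaded as possible. The names `mesh_path` and `route_offline` are hypothetical and introduced here only for illustration.

```python
from collections import Counter

def mesh_path(src, dst, x_first=True):
    """Directed links of a dimension-ordered route on a 2D mesh (assumed topology)."""
    x, y = src
    dx, dy = dst
    links = []

    def walk_x():
        nonlocal x
        while x != dx:
            nx = x + (1 if dx > x else -1)
            links.append(((x, y), (nx, y)))
            x = nx

    def walk_y():
        nonlocal y
        while y != dy:
            ny = y + (1 if dy > y else -1)
            links.append(((x, y), (x, ny)))
            y = ny

    if x_first:
        walk_x()
        walk_y()
    else:
        walk_y()
        walk_x()
    return links

def route_offline(flows):
    """Greedy offline routing: flows are known in advance, so each one is
    assigned the candidate path that minimizes its worst link load."""
    load = Counter()   # number of flows assigned to each directed link
    paths = []         # chosen path per flow, in input order
    for src, dst in flows:
        candidates = [mesh_path(src, dst, True), mesh_path(src, dst, False)]
        best = min(
            candidates,
            key=lambda p: max((load[link] + 1 for link in p), default=0),
        )
        for link in best:
            load[link] += 1
        paths.append(best)
    return paths, load

if __name__ == "__main__":
    # Two flows from the same source; routing both X-first would share a link.
    flows = [((0, 0), (2, 1)), ((0, 0), (2, 2))]
    paths, load = route_offline(flows)
    print("max link load:", max(load.values()))
```

In this small case, routing both flows X-first would place two flows on the link from (0, 0) to (1, 0); because the pattern is known ahead of time, the second flow is sent Y-first instead and no link carries more than one flow.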
Publisher
Association for Computing Machinery (ACM)
Cited by
10 articles.