Affiliation:
1. Department of Computer Science and Engineering, The Pennsylvania State University, USA
Abstract
We consider the problem of mapping irregular applications to multiprocessor architectures whose interconnect topologies affect the latencies of data movement across processor nodes. Solutions to this problem start from suitable weighted graph representations of the irregular application and of the processor topology. Prior results have demonstrated that graph partitioning approaches can provide high-quality solutions. Additionally, when coordinate information is available for the weighted graph of the application, geometric mapping schemes can also provide high-quality solutions. We develop and present a scheme that we call ‘embedded sectioning’, which directly computes a locality-enhancing embedding of the weighted graph representation; this embedding is then mapped to the processor topology using recursive coordinate bisection. Our scheme is specifically directed at obtaining high-quality mappings for highly irregular applications where the amount of communication can vary greatly. We evaluate the quality of mappings produced by embedded sectioning for mesh-based processor topologies using well-accepted measures including congestion, dilation and their product, referred to as the communication volume. For a test suite of unit-weight graphs mapped to a 32 × 32 mesh of processors, our method improves congestion by 26%, dilation by 52% and communication volume by 64% relative to the best values of these measures from nine other schemes. Additionally, we observe that these improvements grow as the skewness of communication in applications increases. For a test suite with a skewness of two, the corresponding improvements for congestion, dilation and communication volume are 72%, 52% and 87%, respectively.
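The final stage named in the abstract, mapping embedded vertices to a processor mesh by recursive coordinate bisection and then scoring the result, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `rcb_map` and `mapping_metrics` are hypothetical, the 2D vertex coordinates are assumed to be given already (the embedding step itself is not shown), messages are assumed to follow row-first dimension-ordered routing, congestion is taken as the maximum weighted load on any mesh link, and dilation as the maximum hop count over all edges. The paper may define these measures differently.

```python
def rcb_map(coords, rows, cols):
    """Assign each vertex to a (row, col) processor of a rows x cols mesh.

    coords maps vertex -> (x, y). The processor block and the vertex set
    are bisected together: x splits columns, y splits rows.
    """
    placement = {}

    def recurse(verts, r0, r1, c0, c1):
        # Processor block covers rows [r0, r1) and columns [c0, c1).
        if not verts:
            return
        nr, nc = r1 - r0, c1 - c0
        if nr == 1 and nc == 1:
            for v in verts:
                placement[v] = (r0, c0)
            return
        # Cut the longer side of the processor block.
        axis, n = (0, nc) if nc >= nr else (1, nr)
        half = n // 2
        ordered = sorted(verts, key=lambda v: coords[v][axis])
        # Give each half a vertex share proportional to its processor count.
        cut = len(ordered) * half // n
        low, high = ordered[:cut], ordered[cut:]
        if axis == 0:
            recurse(low, r0, r1, c0, c0 + half)
            recurse(high, r0, r1, c0 + half, c1)
        else:
            recurse(low, r0, r0 + half, c0, c1)
            recurse(high, r0 + half, r1, c0, c1)

    recurse(list(coords), 0, rows, 0, cols)
    return placement


def mapping_metrics(edges, placement):
    """Return (congestion, dilation) for weighted edges (u, v, w)."""
    link_load = {}
    dilation = 0
    for u, v, w in edges:
        (r, c), (r2, c2) = placement[u], placement[v]
        dilation = max(dilation, abs(r - r2) + abs(c - c2))
        # Assumed routing: move along the row first, then along the column.
        while c != c2:
            nxt = c + (1 if c2 > c else -1)
            key = ((r, min(c, nxt)), (r, max(c, nxt)))
            link_load[key] = link_load.get(key, 0) + w
            c = nxt
        while r != r2:
            nxt = r + (1 if r2 > r else -1)
            key = ((min(r, nxt), c), (max(r, nxt), c))
            link_load[key] = link_load.get(key, 0) + w
            r = nxt
    congestion = max(link_load.values(), default=0)
    return congestion, dilation


# Toy example: four vertices on a unit square mapped to a 2 x 2 mesh.
coords = {0: (0.0, 0.0), 1: (1.0, 0.0), 2: (0.0, 1.0), 3: (1.0, 1.0)}
edges = [(0, 1, 1), (0, 2, 1), (1, 3, 1), (2, 3, 1), (0, 3, 2)]
placement = rcb_map(coords, rows=2, cols=2)
congestion, dilation = mapping_metrics(edges, placement)
```

In this toy case the heavier diagonal edge (weight 2) shares links with two unit-weight edges, so a skewed weight distribution shows up directly in the congestion value. Under the abstract's definition, the communication volume would be the product `congestion * dilation`.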
Subject
Hardware and Architecture, Theoretical Computer Science, Software
Cited by
4 articles.