Affiliation:
1. University of Michigan, Ann Arbor, MI
Abstract
Clustered architectures are a solution to the bottleneck of centralized register files in superscalar and VLIW processors. The main challenge associated with clustered architectures is compiler support to effectively partition operations across the available resources on each cluster. In this work, we present a novel technique for clustering operations based on graph partitioning methods. Our approach incorporates new methods of assigning weights to nodes and edges within the dataflow graph to guide the partitioner. Nodes are assigned weights to reflect their resource usage within a cluster, while a slack distribution method intelligently assigns weights to edges to reflect the cost of inserting moves across clusters. A multilevel graph partitioning algorithm, which globally divides a dataflow graph into multiple parts in a hierarchical manner, uses these weights to efficiently generate estimates for the quality of partitions. We found that our algorithm was able to achieve an average of 20% improvement in DSP kernels and 5% improvement in SPECint2000 for a four-cluster architecture.
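The abstract names its ingredients (node weights for resource usage, slack-derived edge weights, and a partitioner that scores candidate partitions) but gives no formulas. The following is only a minimal sketch of how such a score could be computed, not the paper's method. The function names, the two-level weight rule (`critical` versus 1), and the `move_latency` parameter are all assumptions made for illustration. The multilevel coarsening and refinement steps are not shown.

```python
from collections import defaultdict

def edge_slack(nodes, edges, latency):
    """Slack of each dataflow edge; `nodes` must be in topological order."""
    preds, succs = defaultdict(list), defaultdict(list)
    for u, v in edges:
        succs[u].append(v)
        preds[v].append(u)
    # Earliest start time of each operation (ASAP schedule).
    asap = {}
    for n in nodes:
        asap[n] = max((asap[p] + latency[p] for p in preds[n]), default=0)
    length = max(asap[n] + latency[n] for n in nodes)
    # Latest start time that keeps the critical path length (ALAP schedule).
    alap = {}
    for n in reversed(nodes):
        alap[n] = min((alap[s] for s in succs[n]), default=length) - latency[n]
    # Slack: delay an edge can absorb without stretching the schedule.
    return {(u, v): alap[v] - (asap[u] + latency[u]) for u, v in edges}

def edge_weights(slack, move_latency=1, critical=10):
    """Hypothetical rule: edges that cannot absorb a move get a high weight."""
    return {e: (critical if s < move_latency else 1) for e, s in slack.items()}

def cut_cost(assignment, weights):
    """Total weight of edges whose endpoints sit on different clusters."""
    return sum(w for (u, v), w in weights.items()
               if assignment[u] != assignment[v])

def cluster_load(assignment, node_weight):
    """Sum of node weights (resource usage) placed on each cluster."""
    load = defaultdict(int)
    for n, c in assignment.items():
        load[c] += node_weight[n]
    return dict(load)

# Tiny example graph: a -> c, b -> c, c -> d, with a slower than b.
nodes = ["a", "b", "c", "d"]
edges = [("a", "c"), ("b", "c"), ("c", "d")]
latency = {"a": 2, "b": 1, "c": 1, "d": 1}
weights = edge_weights(edge_slack(nodes, edges, latency))
# Moving only b to the second cluster cuts the one edge that has slack.
print(cut_cost({"a": 0, "b": 1, "c": 0, "d": 0}, weights))
```

In this sketch a partitioner would prefer assignments with a low cut cost and balanced cluster loads; how the paper actually combines the two is not stated in the abstract.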
Publisher
Association for Computing Machinery (ACM)
Subject
Computer Graphics and Computer-Aided Design, Software
Cited by
15 articles.
1. Compiling for VLIW DSPs;Handbook of Signal Processing Systems;2018-10-14
2. A constraint programming approach for integrated spatial and temporal scheduling for clustered architectures;ACM Transactions on Embedded Computing Systems;2013-08
3. Low cost control flow protection using abstract control signatures;Proceedings of the 14th ACM SIGPLAN/SIGBED conference on Languages, compilers and tools for embedded systems;2013-06-20
4. Low cost control flow protection using abstract control signatures;ACM SIGPLAN Notices;2013-05-23
5. Compiling for VLIW DSPs;Handbook of Signal Processing Systems;2013