You Only Traverse Twice: A YOTT Placement, Routing, and Timing Approach for CGRAs

Published:2021-10-31 Issue:5s Volume:20 Page:1-25
ISSN:1539-9087
Container-title:ACM Transactions on Embedded Computing Systems
language:en
Short-container-title:ACM Trans. Embed. Comput. Syst.

Author:

Canesche Michael¹, Carvalho Westerley¹, Reis Lucas¹, Oliveira Matheus¹, Magalhães Salles¹, Jamieson Peter², Nacif Jaugusto M.¹, Ferreira Ricardo¹

Affiliation:

1. Universidade Federal de Viçosa, Brazil

2. Miami University, USA

Abstract

Coarse-grained reconfigurable architecture (CGRA) mapping involves three main steps: placement, routing, and timing. The mapping is an NP-complete problem, and a common strategy is to decouple this process into its independent steps. This work focuses on the placement step, and its aim is to propose a technique that is both reasonably fast and leads to high-performance solutions. Furthermore, a near-optimal placement simplifies the following routing and timing steps. Exact solutions cannot find placements in a reasonable execution time as input designs increase in size. Heuristic solutions include meta-heuristics, such as Simulated Annealing (SA), and fast, straightforward greedy heuristics based on graph traversal. However, as these approaches are probabilistic and have a large design space, it is not easy to provide both run-time efficiency and good solution quality. We propose a graph traversal heuristic that provides the best of both: high-quality placements similar to SA and the execution time of graph traversal approaches. Our placement introduces novel ideas based on a “you only traverse twice” (YOTT) approach that performs a two-step graph traversal. The first traversal generates annotated data to guide the second step, which greedily performs the placement, node by node, aided by the annotated data and target architecture constraints. We introduce three new concepts to implement this technique: I/O and reconvergence annotation, degree matching, and look-ahead placement. Our analysis of this approach explores the placement execution time/quality trade-offs. We point out insights on how to analyze graph properties during dataflow mapping. Our results show that YOTT is 60.6×, 9.7×, and 2.3× faster than a high-quality SA, bounding box SA VPR, and multi-single traversal placements, respectively. Furthermore, YOTT reduces the average wire length and the maximal FIFO size (an additional timing requirement on CGRAs) to avoid delay mismatches in fully pipelined architectures.
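
The two-pass idea in the abstract (one traversal to annotate the graph, a second to place nodes greedily) can be illustrated with a small generic sketch. This is not the authors' YOTT implementation: it omits I/O and reconvergence annotation, degree matching, and look-ahead placement, and it assumes an acyclic dataflow graph mapped onto a plain rows × cols mesh with Manhattan distance as the wire-length proxy. The names `traversal_order`, `greedy_place`, and `wire_length` are hypothetical and chosen only for this example.

```python
from collections import deque
from itertools import product


def traversal_order(succ):
    """First pass: BFS from the input nodes; returns visit order and neighbor sets."""
    nodes = set(succ) | {v for vs in succ.values() for v in vs}
    neigh = {n: set() for n in nodes}
    indeg = {n: 0 for n in nodes}
    for u, vs in succ.items():
        for v in vs:
            neigh[u].add(v)
            neigh[v].add(u)
            indeg[v] += 1
    # Input nodes have no predecessors (assumes the graph is a DAG).
    roots = sorted(n for n in nodes if indeg[n] == 0)
    order, seen, queue = [], set(roots), deque(roots)
    while queue:
        u = queue.popleft()
        order.append(u)
        for v in sorted(neigh[u]):
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return order, neigh


def greedy_place(succ, rows, cols):
    """Second pass: place nodes one by one on the free cell closest to placed neighbors."""
    order, neigh = traversal_order(succ)
    if len(order) > rows * cols:
        raise ValueError("graph does not fit on the grid")
    free = set(product(range(rows), range(cols)))
    pos = {}
    for n in order:
        placed = [pos[m] for m in neigh[n] if m in pos]

        def cost(cell):
            # Sum of Manhattan distances to neighbors that already have a cell.
            return sum(abs(cell[0] - r) + abs(cell[1] - c) for r, c in placed)

        best = min(sorted(free), key=cost)  # sorted() makes ties deterministic
        pos[n] = best
        free.remove(best)
    return pos


def wire_length(succ, pos):
    """Total Manhattan length over all edges of the placed graph."""
    return sum(
        abs(pos[u][0] - pos[v][0]) + abs(pos[u][1] - pos[v][1])
        for u, vs in succ.items()
        for v in vs
    )
```

For example, `greedy_place({"a": ["b", "c"], "b": ["d"], "c": ["d"]}, 2, 2)` places the four-node diamond on a 2 × 2 mesh; `wire_length` can then be used to compare placements produced by different traversal orders.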

Publisher

Association for Computing Machinery (ACM)

Subject

Hardware and Architecture, Software

Link

https://dl.acm.org/doi/pdf/10.1145/3477038

Cited by 1 article.

1. GCN-RA: A graph convolutional network-based resource allocator for reconfigurable systems;Journal of Computational Science;2023-12