Affiliation:
1. University of New South Wales, NSW, Australia
2. Macquarie University, NSW, Australia
Abstract
Coarse-grained reconfigurable architectures (CGRAs) have drawn increasing attention due to their performance and flexibility advantages. Typically, CGRAs incorporate many processing elements in the form of an array, which is suitable for implementing spatial redundancy, as used in the design of fault-tolerant systems. This article introduces a recovery time model for transient faults in CGRAs. The proposed fault-tolerant CGRAs are based on triple modular redundancy and coding techniques for error detection and correction. To evaluate the model, several kernels from space computing are mapped onto the suggested architecture. We demonstrate the tradeoff between recovery time, performance, and area. In addition, the average execution time of an application including recovery time is evaluated using area-based error-rate estimates in harsh radiation environments. The results show that task partitioning is important for bounding the recovery time of applications that have long execution times. It is also shown that error-correcting code (ECC) is of limited practical value for tasks with long execution times in high radiation environments, or when the degree of task partitioning is high.
Funder
Discovery
Australian Research Council's Linkage
Publisher
Association for Computing Machinery (ACM)
Subject
Hardware and Architecture,Software
Cited by
2 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献