Affiliation:
1. University of Houston, Clear Lake, USA
Abstract
The unprecedented scaling of embedded devices, and the stochastic faults that come with it, make reliability a critical design and optimization metric. This chapter presents a task recomputation-based approach for improving the reliability of multi-core embedded systems. Given a task graph representation of the application, the proposed technique targets the tasks whose failures have a more significant effect on overall system reliability. The results of tasks with a larger fault propagation scope are recomputed during the idle times of the available processors, without incurring any performance or power overhead. The technique incorporates the fault propagation scope of each task and its degree of criticality into the scheduling algorithm and maximizes the usage of the processing elements. The experimental evaluation demonstrates the viability of the proposed approach, which generates more efficient results under different latency constraints.
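As a rough illustration of the idea summarized above, the following Python sketch ranks tasks by an assumed fault propagation scope and greedily places their recomputations into idle processor time. The abstract does not define the scope metric or the scheduling algorithm, so everything here is a hypothetical stand-in: the scope is taken to be the number of tasks reachable from a task in the graph, idle time is modeled as a single per-processor budget, the degree of criticality is ignored, and the names `propagation_scope` and `pick_recomputations` are invented for this example.

```python
from typing import Dict, List, Set, Tuple


def propagation_scope(graph: Dict[str, List[str]]) -> Dict[str, int]:
    """Count the tasks reachable from each task (assumed scope metric)."""
    def descendants(task: str, seen: Set[str]) -> Set[str]:
        for succ in graph.get(task, []):
            if succ not in seen:
                seen.add(succ)
                descendants(succ, seen)
        return seen

    return {task: len(descendants(task, set())) for task in graph}


def pick_recomputations(
    graph: Dict[str, List[str]],
    duration: Dict[str, int],
    idle: Dict[str, int],
) -> List[Tuple[str, str]]:
    """Greedily fit recomputations of high-scope tasks into idle time."""
    scope = propagation_scope(graph)
    remaining = dict(idle)  # copy so the caller's idle budget is not mutated
    placed: List[Tuple[str, str]] = []
    # Highest scope first; ties are broken by task name for determinism.
    for task in sorted(graph, key=lambda t: (-scope[t], t)):
        for proc in sorted(remaining):
            if remaining[proc] >= duration[task]:
                remaining[proc] -= duration[task]
                placed.append((task, proc))
                break  # each task is recomputed at most once
    return placed


# Hypothetical four-task graph: A feeds B and C, which both feed D.
graph = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
duration = {"A": 4, "B": 2, "C": 3, "D": 1}
idle = {"P0": 5, "P1": 2}  # idle time units per processor (assumed)

plan = pick_recomputations(graph, duration, idle)
```

Because only existing idle time is consumed, the original schedule length is unchanged in this model. A fuller version would also need to respect when each idle interval occurs relative to the task's inputs, which this sketch leaves out.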