Efficient Fault-Tolerant Topology Reconfiguration Using a Maximum Flow Algorithm

Author:

Ren Yu1,Liu Leibo1,Yin Shouyi1,Han Jie2,Wei Shaojun1

Affiliation:

1. Tsinghua University, Beijing, China

2. University of Alberta, Edmonton, Canada

Abstract

With an increasing number of processing elements (PEs) integrated on a single chip, fault-tolerant techniques are critical to ensure the reliability of such complex systems. In current reconfigurable architectures, redundant PEs are utilized for fault tolerance. In the presence of faulty PEs, the physical topologies of various chips may be different, so the concept of virtual topology from network embedding problem has been used to alleviate the burden for the operating systems. With limited hardware resources, how to reconfigure a system into the most effective virtual topology such that the maximum repair rate can be reached presents a significant challenge. In this article, a new approach using a maximum flow (MF) algorithm is proposed for an efficient topology reconfiguration in reconfigurable architectures. In this approach, topology reconfiguration is converted into a network flow problem by constructing a directed graph; the solution is then found by using the MF algorithm. This approach optimizes the use of spare PEs with minimal impacts on area, throughput, and delay, and thus it significantly improves the repair rate of faulty PEs. In addition, it achieves a polynomial reconfiguration time. Experimental results show that compared to previous methods, the MF approach increases the probability to repair faulty PEs by up to 50% using the same redundant resources. Compared to a fault-free system, the throughput only decreases by less than 2.5% and latency increases by less than 4%. To consider various types of PEs in a practical application, a cost factor is introduced into the MF algorithm. An enhanced approach using a minimum-cost MF algorithm is further shown to be efficient in the fault-tolerant reconfiguration of heterogeneous reconfigurable architectures.

Funder

China National High Technologies Research Program

Publisher

Association for Computing Machinery (ACM)

Subject

General Computer Science

Cited by 7 articles. 订阅此论文施引文献 订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献

同舟云学术

1.学者识别学者识别

2.学术分析学术分析

3.人才评估人才评估

"同舟云学术"是以全球学者为主线,采集、加工和组织学术论文而形成的新型学术文献查询和分析系统,可以对全球学者进行文献检索和人才价值评估。用户可以通过关注某些学科领域的顶尖人物而持续追踪该领域的学科进展和研究前沿。经过近期的数据扩容,当前同舟云学术共收录了国内外主流学术期刊6万余种,收集的期刊论文及会议论文总量共计约1.5亿篇,并以每天添加12000余篇中外论文的速度递增。我们也可以为用户提供个性化、定制化的学者数据。欢迎来电咨询!咨询电话:010-8811{复制后删除}0370

www.globalauthorid.com

TOP

Copyright © 2019-2024 北京同舟云网络信息技术有限公司
京公网安备11010802033243号  京ICP备18003416号-3