RLARA: A TSV-Aware Reinforcement Learning Assisted Fault-Tolerant Routing Algorithm for 3D Network-on-Chip

Published:2023-12-02 Issue:23 Volume:12 Page:4867
ISSN:2079-9292
Container-title:Electronics
language:en
Short-container-title:Electronics

Author:

Jiao Jiajia¹, Shen Ruirui¹, Chen Lujian¹, Liu Jin¹, Han Dezhi¹

Affiliation:

1. College of Information Engineering, Shanghai Maritime University, Shanghai 201306, China

Abstract

A three-dimensional Network-on-Chip (3D NoC) gives modern multicore processors good scalability, a small area, and high performance by using vertical through-silicon vias (TSVs). However, TSVs fail more often than horizontal links, which causes unpredictable topology variations and requires adaptive routing algorithms that select available paths dynamically. Most prior work has targeted congestion control in partially TSV-connected 3D NoCs to bypass the TSV reliability issue, while other work has focused on fault tolerance in fully TSV-connected 3D NoCs and ignored the performance degradation. To improve both reliability and performance in fully TSV-connected 3D NoC architectures, we propose a TSV-aware Reinforcement Learning Assisted Routing Algorithm (RLARA) for fault-tolerant 3D NoCs. The proposed method takes advantage of both the high throughput of fully connected TSVs and the cost-effective fault tolerance of partially connected TSVs by using a periodically updated, TSV-aware Q-table from reinforcement learning. RLARA makes distributed routing decisions that favor the lowest TSV utilization, to avoid overheating the TSVs and mitigate the reliability problem. In addition, the K-means clustering algorithm is adopted to compress the routing table of RLARA by exploiting the similarity of routing information. To alleviate the deadlock issue inherent in adaptive routing algorithms, the link Q-value from reinforcement learning is combined with the router status, based on buffer utilization, to predict congestion and enable RLARA to perform best even under a high traffic load. Experimental results from an ablation study on the Garnet 2.0 simulator verify the effectiveness of RLARA under different fault models: it performs better than the latest 3D NoC routing algorithms, with up to 9.04% lower average delay and an 8.58% higher successful delivery rate.
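The abstract does not give RLARA's update rule or scoring function, so the following Python sketch is only an illustration of the general idea it describes: a per-router Q-table whose link estimates are combined with buffer utilization to choose an output port while skipping faulty links. It uses a generic Q-routing style update, in which Q is treated as an estimated delivery cost to be minimized. The class name `RouterQTable`, the parameters `alpha` and `beta`, and the port labels are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass, field


@dataclass
class RouterQTable:
    """Hypothetical per-router table: q[dest][port] = estimated delivery cost."""
    ports: list
    alpha: float = 0.5  # learning rate (assumed value)
    beta: float = 1.0   # weight of buffer utilization in the score (assumed value)
    q: dict = field(default_factory=dict)

    def _row(self, dest):
        # Lazily create a row of zero-initialised estimates for a new destination.
        return self.q.setdefault(dest, {p: 0.0 for p in self.ports})

    def update(self, dest, port, link_cost, neighbor_best):
        """Move the estimate toward link_cost plus the neighbor's best estimate."""
        row = self._row(dest)
        row[port] += self.alpha * (link_cost + neighbor_best - row[port])
        return row[port]

    def best_estimate(self, dest):
        """Lowest estimated cost to dest; this is what a neighbor would be told."""
        return min(self._row(dest).values())

    def select_port(self, dest, buffer_util, faulty=()):
        """Pick the non-faulty port with the lowest Q-value plus congestion penalty."""
        row = self._row(dest)
        candidates = [p for p in self.ports if p not in faulty]
        if not candidates:
            return None  # no usable output port
        return min(candidates,
                   key=lambda p: row[p] + self.beta * buffer_util.get(p, 0.0))


table = RouterQTable(ports=["east", "up"])  # "up" stands in for a TSV link
table.update("d", "east", link_cost=1.0, neighbor_best=4.0)
table.update("d", "up", link_cost=1.0, neighbor_best=1.0)
print(table.select_port("d", {"east": 0.0, "up": 0.0}))
print(table.select_port("d", {"east": 0.0, "up": 0.0}, faulty={"up"}))
```

In this sketch a faulty vertical link is simply excluded from the candidate set, and a congested port is penalized in proportion to its buffer utilization. How RLARA actually weights TSV utilization, schedules its periodic updates, and compresses the table with K-means is specified in the paper itself, not here.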

Publisher

MDPI AG

Subject

Electrical and Electronic Engineering,Computer Networks and Communications,Hardware and Architecture,Signal Processing,Control and Systems Engineering

Link

https://www.mdpi.com/2079-9292/12/23/4867/pdf

Reference: 38 articles.

1. Topol (2006). Three-dimensional integrated circuits. IBM J. Res. Dev.

2. Liu, C., Zhang, L., Han, Y., and Li, X. (2011, January 25–28). Vertical interconnects squeezing in symmetric 3D mesh Network-on-Chip. Proceedings of the 16th Asia and South Pacific Design Automation Conference, Yokohama, Japan.

3. Feero (2008). Networks-on-chip in a three-dimensional environment: A performance evaluation. IEEE Trans. Comput.

4. Syal (2020). Qualitative analysis of 3D routing algorithms in 3 × 3 × 3 mesh NoC topology under varying load in Bio-SoC. Int. J. E-Health Med. Commun.

5. Khayambashi (2015). Analytical reliability analysis of 3D NoC under TSV failure. ACM J. Emerg. Technol. Comput. Syst.