Affiliation:
1. University of Michigan, Ann Arbor, MI
Abstract
The increasing number of units in today's systems-on-chip and multicore processors has led to complex intra-chip communication solutions. Specifically, Networks-on-Chip (NoCs) have emerged as a favorable fabric to provide high bandwidth and low latency in connecting many units in a same chip. To achieve these goals, the NoC often includes complex components and advanced features, leading to the development of large and highly complex interconnect subsystems. One of the biggest challenges in these designs is to ensure the correct functionality of this communication infrastructure. To support this goal, an increasing fraction of the validation effort has shifted to post-silicon validation, because it permits exercising network activities that are too complex to be validated in pre-silicon. However, post-silicon validation is hindered by the lack of observability of the network's internal operations and thus, diagnosing functional errors during this phase is very difficult.
In this work, we propose a post-silicon validation platform that improves observability of network operations by taking periodic snapshots of the traffic traversing the network. Each node's local cache is configured to temporarily store the snapshot logs in a designated area reserved for post-silicon validation and relinquished after product release. Each snapshot log is analyzed locally by a software algorithm running on its corresponding core, in order to detect functional errors. Upon error detection, all snapshot logs are aggregated at a central location to extract additional debug data, including an overview of network traffic surrounding the error event, as well as a partial reconstruction of the routes followed by packets in flight at the time. In our experiments, we found that this approach allows us to detect several types of functional errors, as well as observe, on average, over 50% of the network's traffic and reconstruct at least half of each of their routes through the network.
Funder
Semiconductor Research Corporation
National Science Foundation
Publisher
Association for Computing Machinery (ACM)
Subject
Hardware and Architecture,Software
Cited by
12 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献