Affiliation:
1. University of Maryland
Abstract
We present a framework for performance-, bandwidth-, and energy-efficient intercore communication in embedded MultiProcessor Systems-on-a-Chip (MPSoC). The methodology seamlessly integrates compiler, operating system, and hardware support to achieve a low-cost communication between synchronized producers and consumers. The technique is especially beneficial for data-streaming applications exploiting pipeline parallelism with computational phases mapped to separate cores. Code transformations utilizing a simple ISA support ensure that producer writes are propagated to consumers with a single interconnect transaction per cache block just prior to the producer exiting its synchronization region. Furthermore, in order to completely eliminate misses to shared data caused by interference with private data and also to minimize the cache energy, we integrate to the proposed framework a cache way partitioning policy based on a simple cache configurability support, which isolates the shared buffers from other cache traffic. This mechanism results in significant power savings since only a subset of the cache ways needs to be looked up for each cache access. The end result of the proposed framework is a single communication transaction per shared cache block between a producer and a consumer with no coherence misses on the consumer caches. Our experiments demonstrate significant reductions in interconnect traffic, cache misses, and energy for a set of multiprocessor benchmarks.
Publisher
Association for Computing Machinery (ACM)
Subject
Electrical and Electronic Engineering,Computer Graphics and Computer-Aided Design,Computer Science Applications