Affiliation:
1. ETH Zürich, Zürich, Switzerland
Abstract
A common multi-core pattern consists of processors communicating through shared, multi-banked on-chip memory. Two approaches exist: interleaved address mapping, which spreads consecutive data over all banks, and contiguous address mapping, which stores consecutive data in a single bank.
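To make the difference between the two mappings concrete, the following minimal sketch computes which bank an address falls into under each scheme. The function names, the word-granular interleaving, and the bank count and sizes are illustrative assumptions, not parameters taken from the paper or from the Kalray MPPA-256.

```python
# Illustrative sketch only: bank count, bank size and interleaving
# granularity below are assumed values, not those of a real platform.

def interleaved_bank(addr, num_banks, word_bytes):
    """Bank index when consecutive words rotate over all banks."""
    return (addr // word_bytes) % num_banks


def contiguous_bank(addr, bank_bytes):
    """Bank index when each bank covers one contiguous address range."""
    return addr // bank_bytes


NUM_BANKS = 4      # assumed
WORD_BYTES = 8     # assumed interleaving granularity
BANK_BYTES = 1024  # assumed bank capacity

addresses = [0, 8, 16, 24, 32]
interleaved = [interleaved_bank(a, NUM_BANKS, WORD_BYTES) for a in addresses]
contiguous = [contiguous_bank(a, BANK_BYTES) for a in addresses]
print(interleaved)  # [0, 1, 2, 3, 0]
print(contiguous)   # [0, 0, 0, 0, 0]
```

Under the interleaved scheme a sequential scan touches every bank in turn, whereas under the contiguous scheme the same scan stays in one bank, so which processors collide depends on where each data block is placed.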
In this work, we compare both approaches on the Kalray MPPA-256 platform. For contiguous mapping, we propose an algorithm, based on graph colouring techniques, to automatically perform the assignment of data blocks to memory banks with the goal of minimising access collisions and delays. Experiments with representative, parallel real-world benchmarks show that 69% of the tested configurations, when optimised for contiguous mapping by our algorithm, run up to 86% faster on average than with interleaved mapping.
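The abstract does not spell out the colouring algorithm itself. As a rough illustration of the general idea, the sketch below uses a simple greedy heuristic in which data blocks are graph nodes, an edge weight estimates how often two blocks are accessed concurrently, and banks play the role of colours. The function names, the conflict weights, and the heuristic are assumptions made for this example and should not be read as the authors' method; bank capacity is ignored for brevity.

```python
# Greedy weighted colouring sketch (hypothetical, not the paper's algorithm).
# conflicts maps frozenset({block_a, block_b}) to an assumed collision weight.

def assign_banks(blocks, conflicts, num_banks):
    """Assign each block to the bank that adds the least conflict weight."""
    # Total conflict weight per block, used to place heavy blocks first.
    weight = {b: 0 for b in blocks}
    for pair, w in conflicts.items():
        for b in pair:
            weight[b] += w
    order = sorted(blocks, key=lambda b: (-weight[b], b))

    assignment = {}
    for b in order:
        # Cost of each bank = weight of conflicts with blocks already there.
        cost = [0] * num_banks
        for other, bank in assignment.items():
            cost[bank] += conflicts.get(frozenset((b, other)), 0)
        assignment[b] = min(range(num_banks), key=lambda k: cost[k])
    return assignment


def total_conflict(assignment, conflicts):
    """Sum the weights of all conflicting pairs that share a bank."""
    return sum(
        w for pair, w in conflicts.items()
        if len({assignment[b] for b in pair}) == 1
    )


# Assumed example: three blocks, two banks.
conflicts = {
    frozenset(("A", "B")): 10,
    frozenset(("B", "C")): 5,
    frozenset(("A", "C")): 1,
}
assignment = assign_banks(["A", "B", "C"], conflicts, num_banks=2)
print(assignment, total_conflict(assignment, conflicts))
```

With only two banks, one conflicting pair must share a bank; the heuristic separates the two heaviest pairs and co-locates A and C, leaving a residual conflict weight of 1.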
Publisher
Association for Computing Machinery (ACM)
Subject
Hardware and Architecture, Software
Cited by
8 articles.
1. Enlarging the Time Budget for Neural Network Based Predictors for Access Interval Prediction;2024 International Conference on Artificial Intelligence, Computer, Data Sciences and Applications (ACDSA);2024-02-01
2. Extension VM: Interleaved Data Layout in Vector Memory;ACM Transactions on Architecture and Code Optimization;2023-11-07
3. CABARRE: Request Response Arbitration for Shared Cache Management;ACM Transactions on Embedded Computing Systems;2023-09-09
4. Optimizing Memory Allocation for Multi-Subgraph Mapping on Spatial Accelerators;Proceedings of the 16th ACM International Conference on Systems and Storage;2023-06-05
5. MuGRA: A Scalable Multi-Grained Reconfigurable Accelerator Powered by Elastic Neural Network;IEEE Transactions on Circuits and Systems I: Regular Papers;2022-01