Modular Switched Multiported SRAM-Based Memories

Author:

Abdelhadi Ameer M. S.1, Lemieux Guy G. F.1

Affiliation:

1. University of British Columbia, Vancouver, BC Canada

Abstract

Multiported RAMs are essential for high-performance parallel computation systems. VLIW and vector processors, CGRAs, DSPs, CMPs, and other processing systems often rely upon multiported memories for parallel access. Although memories with a large number of read and write ports are important, their high implementation cost means that they are used sparingly. As a result, FPGA vendors only provide dual-ported block RAMs (BRAMs) to handle the majority of usage patterns. Furthermore, recent attempts to create FPGA-based multiported memories suffer from low storage utilization. Whereas most approaches provide simple unidirectional ports, each fixed as either read or write, others propose true bidirectional ports where each port dynamically switches between read and write. True RAM ports are useful for systems with transceivers and provide high RAM flexibility; however, this flexibility incurs high BRAM consumption. In this article, a novel, modular, BRAM-based switched multiported RAM architecture is proposed. In addition to unidirectional ports with fixed read/write, this switched architecture allows a group of write ports to switch with another group of read ports dynamically, hence altering the number of active ports. The proposed switched-ports architecture is less flexible than a true-multiported RAM where each port is switched individually. Nevertheless, switched memories can dramatically reduce BRAM consumption compared to true ports for systems with alternating port requirements. Previous live-value-table (LVT) and XOR approaches are merged and optimized into a generalized and modular structure that we call an invalidation-based live-value-table (I-LVT). Like a regular LVT, the I-LVT determines the correct bank to read from, but it differs in how updates to the table are made; the LVT approach requires multiple write ports, often leading to an area-intensive register-based implementation, whereas the XOR approach suffers from excessive storage overhead since wider memories are required to accommodate the XOR-ed data. Two specific I-LVT implementations are proposed and evaluated: binary and thermometer coding. The I-LVT approach is especially suitable for deep memories because the table is implemented only in SRAM cells. The I-LVT method gives higher performance while occupying fewer BRAMs than earlier approaches: for several configurations, BRAM usage is reduced by more than 44% and clock speed is improved by more than 76%. The I-LVT can be used with fixed-port, true-port, or the proposed switched-port architectures. Formal proofs for the suggested methods, resource consumption analysis, usage guidelines, and an analytic comparison to other methods are provided. A fully parameterized Verilog implementation is released as an open-source library. The library has been extensively tested using Altera's EDA tools.
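As a rough illustration of the bank-selection idea mentioned in the abstract, the following is a minimal behavioral sketch of a regular LVT-based multi-write memory. It is not the I-LVT and not the authors' Verilog library. The class name `LVTMemory`, its methods, and the rule that two ports may not write the same address in one cycle are assumptions made for this example.

```python
class LVTMemory:
    """Behavioral model of a regular LVT multi-write memory (hypothetical sketch)."""

    def __init__(self, depth, n_write_ports):
        self.depth = depth
        self.n_write_ports = n_write_ports
        # One data bank per write port; each bank is written by exactly one port.
        self.banks = [[0] * depth for _ in range(n_write_ports)]
        # Live-value table: for each address, the ID of the port that wrote it last.
        self.lvt = [0] * depth

    def write_cycle(self, writes):
        """Apply one cycle of writes, given as {port_id: (addr, data)}."""
        addrs = [addr for addr, _ in writes.values()]
        if len(addrs) != len(set(addrs)):
            # Assumption: same-address write conflicts are not allowed.
            raise ValueError("two ports wrote the same address in one cycle")
        for port, (addr, data) in writes.items():
            self.banks[port][addr] = data  # the port updates only its own bank
            self.lvt[addr] = port          # the table records which bank is live

    def read(self, addr):
        """Return the live value: look up the bank ID, then read that bank."""
        return self.banks[self.lvt[addr]][addr]


mem = LVTMemory(depth=8, n_write_ports=2)
mem.write_cycle({0: (3, 0xAA), 1: (5, 0xBB)})  # both ports write in one cycle
mem.write_cycle({1: (3, 0xCC)})                # port 1 overwrites address 3
print(hex(mem.read(3)), hex(mem.read(5)))
```

In this model every write port updates `lvt` in the same cycle, so the table itself needs as many write ports as the memory. That is the property the abstract identifies as pushing the regular LVT toward a register-based implementation.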

Funder

NSERC

Publisher

Association for Computing Machinery (ACM)

Subject

General Computer Science

Cited by 5 articles.

1. Design and Performance Analysis of Multiported Memory Module Using LVT and XOR Approaches on FPGA Platform;Evolutionary Computing and Mobile Sustainable Networks;2022

2. Revisiting Deep Learning Parallelism: Fine-Grained Inference Engine Utilizing Online Arithmetic;2019 International Conference on Field-Programmable Technology (ICFPT);2019-12

3. Accelerated Approximate Nearest Neighbors Search Through Hierarchical Product Quantization;2019 International Conference on Field-Programmable Technology (ICFPT);2019-12

4. Barrier Synchronization: Simplified, Generalized, and Solved Without Mutual Exclusion;2018 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW);2018-05

5. Efficient TCAM Design Based on Multipumping-Enabled Multiported SRAM on FPGA;IEEE Access;2018
