Configurable range memory for effective data reuse on programmable accelerators

Author:

Lee Jongeun (1), Seo Seongseok (1), Paek Jongkyung (2), Choi Kiyoung (2)

Affiliation:

1. UNIST, Ulsan, Korea

2. Seoul National University, Seoul, Korea

Abstract

While programmable accelerators such as application-specific processors and reconfigurable architectures can dramatically speed up compute-intensive kernels of an application, application performance can still be severely limited by the communication between processors. To minimize the communication overhead, a shared memory such as a scratchpad memory may be employed between the main processor and the accelerator coprocessor. However, this setup poses a significant challenge to the main processor, which now must manage data on the scratchpad explicitly, resulting in superfluous data copying due to the inflexibility of a scratchpad. In this article, we present an enhancement of a scratchpad, Configurable Range Memory (CRM), whose address range can be reprogrammed to minimize unnecessary data copying between processors and therefore promote data reuse on the accelerator, and also present a software management algorithm for the CRM. Our experimental results involving detailed simulation of full multimedia applications demonstrate that our CRM architecture can reduce the communication overhead quite effectively, reducing the kernel execution time by up to 28% and the application runtime by up to 12.8%, in addition to considerable system energy reduction, compared to the conventional architecture based on a scratchpad.
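The core idea in the abstract, making data visible at a different address by reprogramming a range instead of copying it, can be illustrated with a toy software model. This is only a sketch under stated assumptions, not the authors' architecture or their management algorithm: the `RangeMemory` class, the `copy_out` helper, and the addresses `SPM_BASE` and `FRAME_ADDR` are all hypothetical names introduced here for illustration.

```python
class RangeMemory:
    """Toy model of a small memory visible at addresses [base, base + size)."""

    def __init__(self, size, base=0):
        self.size = size
        self.base = base
        self.cells = [0] * size

    def set_range(self, new_base):
        # Reprogram the visible address range; the stored contents do not move.
        self.base = new_base

    def _offset(self, addr):
        if not (self.base <= addr < self.base + self.size):
            raise IndexError(f"address {addr:#x} outside current range")
        return addr - self.base

    def read(self, addr):
        return self.cells[self._offset(addr)]

    def write(self, addr, value):
        self.cells[self._offset(addr)] = value


def copy_out(mem, main_memory, dst_addr, length):
    """Fixed-range baseline: copy results word by word; returns words copied."""
    for i in range(length):
        main_memory[dst_addr + i] = mem.read(mem.base + i)
    return length


SPM_BASE = 0x1000    # hypothetical fixed scratchpad address
FRAME_ADDR = 0x8000  # hypothetical address where the application expects the data

mem = RangeMemory(size=4, base=SPM_BASE)
for i, value in enumerate([10, 20, 30, 40]):  # stand-in for accelerator output
    mem.write(SPM_BASE + i, value)

# Baseline: the main processor copies the results to where it needs them.
main_memory = {}
copied_words = copy_out(mem, main_memory, FRAME_ADDR, 4)

# Reprogrammable range: the same contents become visible at FRAME_ADDR, no copy.
mem.set_range(FRAME_ADDR)
remapped = [mem.read(FRAME_ADDR + i) for i in range(4)]
```

In this model the baseline moves four words while the remapped case moves none; the real hardware cost of reprogramming a range, and the policy for choosing ranges, are outside what this sketch captures.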

Funder

Ministry of Science, ICT and Future Planning

Ministry of Education, Science and Technology

National Research Foundation of Korea

Publisher

Association for Computing Machinery (ACM)

Subject

Electrical and Electronic Engineering, Computer Graphics and Computer-Aided Design, Computer Science Applications

