A Reconfigurable Architecture for Binary Acceleration of Loops with Memory Accesses

Published: 2015-01-23 Issue: 4 Volume: 7 Page: 1-20
ISSN: 1936-7406
Container-title: ACM Transactions on Reconfigurable Technology and Systems
Language: en
Short-container-title: ACM Trans. Reconfigurable Technol. Syst.

Author:

Paulino Nuno¹, Ferreira João Canas¹, Cardoso João M. P.¹

Affiliation:

1. INESC TEC and Faculty of Engineering, University of Porto, Portugal

Abstract

This article presents a reconfigurable hardware/software architecture for binary acceleration of embedded applications. A Reconfigurable Processing Unit (RPU) is used as a coprocessor of the General Purpose Processor (GPP) to accelerate the execution of repetitive instruction sequences called Megablocks. A toolchain detects Megablocks from instruction traces and generates customized RPU implementations. The implementation of Megablocks with memory accesses uses a memory-sharing mechanism to support concurrent accesses to the entire address space of the GPP's data memory. The scheduling of load/store operations and memory access handling have been optimized to minimize the latency introduced by memory accesses. The system is able to dynamically switch execution between the GPP and the RPU when executing the original binaries of the input application. Our proof-of-concept prototype achieved geometric mean speedups of 1.60× for a set of 37 benchmarks and 1.18× for a subset of the 9 most complex benchmarks. With respect to a previous version of our approach, the geometric mean speedup improved from 1.22× to 1.53× for the 10 benchmarks previously used.
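
As an illustration of what detecting a repetitive instruction sequence in a trace can look like, the following minimal Python sketch scans a list of instruction addresses for a pattern that repeats back to back. It is not the toolchain described in the abstract: the function name find_repeating_pattern, the parameters max_len and min_repeats, and the sample trace are all hypothetical, and a real Megablock detector would likely work on richer trace information than plain addresses.

```python
def find_repeating_pattern(trace, max_len=32, min_repeats=3):
    """Return (start, length, repeats) for the consecutive repeating
    pattern that covers the most trace entries, or None if no pattern
    repeats at least min_repeats times.

    Hypothetical helper for illustration only; brute-force search.
    """
    best = None
    n = len(trace)
    for length in range(1, max_len + 1):
        start = 0
        while start + 2 * length <= n:
            pattern = trace[start:start + length]
            repeats = 1
            pos = start + length
            # Count how many times the pattern repeats back to back.
            while trace[pos:pos + length] == pattern:
                repeats += 1
                pos += length
            if repeats >= min_repeats:
                covered = repeats * length
                # Strict comparison keeps the shortest pattern on ties.
                if best is None or covered > best[1] * best[2]:
                    best = (start, length, repeats)
                start = pos  # skip past the repeated region
            else:
                start += 1
    return best


# Hypothetical trace: two setup instructions, a 3-instruction loop body
# executed four times, then an exit instruction.
trace = [0x100, 0x104] + [0x200, 0x204, 0x208] * 4 + [0x300]
print(find_repeating_pattern(trace))  # -> (2, 3, 4)
```

Here the result says that a 3-address pattern starting at trace index 2 repeats 4 times, which is the kind of loop-like region a coprocessor could be asked to accelerate. The brute-force scan is chosen for readability, not efficiency.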

Funder

Fundação para a Ciência e a Tecnologia

European Regional Development Fund through the COMPETE Programme

Publisher

Association for Computing Machinery (ACM)

Subject

General Computer Science

Link

https://dl.acm.org/doi/pdf/10.1145/2629468

References: 20 articles.

1. Conversion of control dependence to data dependence

2. Transparent reconfigurable acceleration for heterogeneous embedded applications

3. On Identifying Segments of Traces for Dynamic Compilation

Cited by 2 articles.

1. Improving Performance and Energy Consumption in Embedded Systems via Binary Acceleration: A Survey;ACM Computing Surveys;2021-01-31

2. Generation of Customized Accelerators for Loop Pipelining of Binary Instruction Traces;IEEE Transactions on Very Large Scale Integration (VLSI) Systems;2017-01