Memory-Constrained Vectorization and Scheduling of Dataflow Graphs for Hybrid CPU-GPU Platforms

Published: 2018-03-31 Issue: 2 Volume: 17 Page: 1-25
ISSN: 1539-9087
Container-title: ACM Transactions on Embedded Computing Systems
language: en
Short-container-title: ACM Trans. Embed. Comput. Syst.

Author:

Lin Shuoxin¹, Wu Jiahao¹, Bhattacharyya Shuvra S.²

Affiliation:

1. University of Maryland, MD, USA

2. University of Maryland and Tampere University of Technology

Abstract

The increasing use of heterogeneous embedded systems with multi-core CPUs and Graphics Processing Units (GPUs) presents important challenges in effectively exploiting pipeline, task, and data-level parallelism to meet throughput requirements of digital signal processing applications. Moreover, in the presence of system-level memory constraints, hand optimization of code to satisfy these requirements is inefficient and error-prone and can therefore greatly slow down development time or result in highly underutilized processing resources. In this article, we present vectorization and scheduling methods to effectively exploit multiple forms of parallelism for throughput optimization on hybrid CPU-GPU platforms, while conforming to system-level memory constraints. The methods operate on synchronous dataflow representations, which are widely used in the design of embedded systems for signal and information processing. We show that our novel methods can significantly improve system throughput compared to previous vectorization and scheduling approaches under the same memory constraints. In addition, we present a practical case study of applying our methods to significantly improve the throughput of an orthogonal frequency division multiplexing receiver system for wireless communications.

Funder

National Science Foundation

Laboratory for Telecommunication Sciences

Publisher

Association for Computing Machinery (ACM)

Subject

Hardware and Architecture, Software

Link

https://dl.acm.org/doi/pdf/10.1145/3157669

Reference: 29 articles.

1. StarPU: a unified platform for task scheduling on heterogeneous multicore architectures

2. S. S. Bhattacharyya, E. Deprettere, R. Leupers, and J. Takala (Eds.). 2013. Handbook of Signal Processing Systems (second ed.). Springer.

3. S. S. Bhattacharyya, P. K. Murthy, and E. A. Lee. 1996. Software Synthesis from Dataflow Graphs. Kluwer Academic.

Cited by 8 articles.

1. XeroZerox: Analysis and Optimization of GPU Memory Management for High-Integrity Autonomous Systems; IEEE Access; 2024

2. Research on vectorized engineering file management model; Applied Mathematics and Nonlinear Sciences; 2023-06-30

3. A Framework for Fixed Priority Periodic Scheduling Synthesis from Synchronous Data-Flow Graphs; Lecture Notes in Computer Science; 2022

4. Real-Time Neuron Detection and Neural Signal Extraction Platform for Miniature Calcium Imaging; Frontiers in Computational Neuroscience; 2020-06-26

5. Real-Time Scheduling upon a Host-Centric Acceleration Architecture with Data Offloading; 2020 IEEE Real-Time and Embedded Technology and Applications Symposium (RTAS); 2020-04