Affiliation:
1. Massachusetts Institute of Technology
Abstract
As multicore architectures enter the mainstream, there is a pressing demand for high-level programming models that can effectively map to them. Stream programming offers an attractive way to expose coarse-grained parallelism, as streaming applications (image, video, DSP, etc.) are naturally represented by independent filters that communicate over explicit data channels. In this paper, we demonstrate an end-to-end stream compiler that attains robust multicore performance in the face of varying application characteristics. As benchmarks exhibit different amounts of task, data, and pipeline parallelism, we exploit all types of parallelism in a unified manner in order to achieve this generality. Our compiler, which maps from the StreamIt language to the 16-core Raw architecture, attains an 11.2x mean speedup over a single-core baseline, and a 1.84x speedup over our previous work.
Publisher
Association for Computing Machinery (ACM)
Subject
Computer Graphics and Computer-Aided Design, Software
Cited by
19 articles.
1. DISCO: Distributed Inference with Sparse Communications;2024 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV);2024-01-03
2. Piper: Pipelining OpenMP Offloading Execution Through Compiler Optimization For Performance;2022 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC);2022-11
3. A Pipeline Pattern Detection Technique in Polly;Workshop Proceedings of the 51st International Conference on Parallel Processing;2022-08-29
4. Hierarchical Scheduling of an SDF/L Graph onto Multiple Processors;ACM Transactions on Design Automation of Electronic Systems;2022-05-31
5. DistrEdge: Speeding up Convolutional Neural Network Inference on Distributed Edge Devices;2022 IEEE International Parallel and Distributed Processing Symposium (IPDPS);2022-05