Copperhead

Published:2011-09-07 Issue:8 Volume:46 Page:47-56
ISSN:0362-1340
Container-title:ACM SIGPLAN Notices
language:en
Short-container-title:SIGPLAN Not.

Author:

Catanzaro Bryan¹, Garland Michael², Keutzer Kurt¹

Affiliation:

1. University of California, Berkeley, Berkeley, CA, USA

2. NVIDIA Corporation, Santa Clara, CA, USA

Abstract

Modern parallel microprocessors deliver high performance on applications that expose substantial fine-grained data parallelism. Although data parallelism is widely available in many computations, implementing data parallel algorithms in low-level languages is often an unnecessarily difficult task. The characteristics of parallel microprocessors and the limitations of current programming methodologies motivate our design of Copperhead, a high-level data parallel language embedded in Python. The Copperhead programmer describes parallel computations via composition of familiar data parallel primitives supporting both flat and nested data parallel computation on arrays of data. Copperhead programs are expressed in a subset of the widely used Python programming language and interoperate with standard Python modules, including libraries for numeric computation, data visualization, and analysis. In this paper, we discuss the language, compiler, and runtime features that enable Copperhead to efficiently execute data parallel code. We define the restricted subset of Python which Copperhead supports and introduce the program analysis techniques necessary for compiling Copperhead code into efficient low-level implementations. We also outline the runtime support by which Copperhead programs interoperate with standard Python modules. We demonstrate the effectiveness of our techniques with several examples targeting the CUDA platform for parallel programming on GPUs. Copperhead code is concise, on average requiring 3.6 times fewer lines of code than CUDA, and the compiler generates efficient code, yielding 45-100% of the performance of hand-crafted, well optimized CUDA code.
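To make the abstract's phrase "composition of familiar data parallel primitives" concrete, here is a minimal sketch in plain Python. It is not Copperhead code and uses no Copperhead API; the function names (`axpy`, `dot`, `spmv`) and the list-of-rows sparse layout are assumptions chosen for illustration. It only shows the programming style the abstract describes: flat data parallelism as an element-wise map, and nested data parallelism as a map over rows whose body is itself a map plus a reduction.

```python
from functools import reduce
from operator import add


def axpy(a, x, y):
    # Flat data parallelism: each output element depends only on the
    # matching input elements, so all of them could be computed at once.
    return [a * xi + yi for xi, yi in zip(x, y)]


def dot(x, y):
    # An element-wise map (multiply) composed with a reduction (sum).
    return reduce(add, map(lambda xi, yi: xi * yi, x, y), 0)


def spmv(values, columns, x):
    # Nested data parallelism: the outer map runs over matrix rows, and
    # each row performs its own inner gather and map-reduce.
    # values[i] holds the nonzero entries of row i (hypothetical layout);
    # columns[i] holds the column index of each of those entries.
    def row_product(row_values, row_columns):
        gathered = [x[j] for j in row_columns]
        return dot(row_values, gathered)

    return list(map(row_product, values, columns))


if __name__ == "__main__":
    print(axpy(2, [1, 2, 3], [10, 20, 30]))
    print(spmv([[1, 2], [3]], [[0, 2], [1]], [10, 20, 30]))
```

Run sequentially, this code only demonstrates the structure. The point is that the functions contain no explicit loops carrying state between iterations, which is the property a data parallel compiler would rely on when mapping such code onto a parallel platform.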

Publisher

Association for Computing Machinery (ACM)

Subject

Computer Graphics and Computer-Aided Design, Software

Link

https://dl.acm.org/doi/pdf/10.1145/2038037.1941562

References: 27 articles.

1. Implementing sparse matrix-vector multiplication on throughput-oriented processors

2. Programming parallel algorithms

3. Implementation of a Portable Nested Data-Parallel Language

4. Fast support vector machine training and classification on graphics processors

Cited by 20 articles.

1. GPotion: An embedded DSL for GPU programming in Elixir;Proceedings of the XXVII Brazilian Symposium on Programming Languages;2023-09-25

2. Performance of the Vipera Framework for DSLs on Micro-Core Architectures;Euro-Par 2022: Parallel Processing Workshops;2023

3. Enabling pipeline parallelism in heterogeneous managed runtime environments via batch processing;Proceedings of the 18th ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments;2022-02-25

4. A Comprehensive Exploration of Languages for Parallel Computing;ACM Computing Surveys;2022-01-18

5. Python programmers have GPUs too: automatic Python loop parallelization with staged dependence analysis;Proceedings of the 15th ACM SIGPLAN International Symposium on Dynamic Languages;2019-10-20