Affiliation:
1. Stanford University, Stanford, CA
2. Intel Corporation, Santa Clara, CA
Abstract
In this paper we propose the Merge framework, a general-purpose programming model for heterogeneous multi-core systems. The Merge framework replaces current ad hoc approaches to parallel programming on heterogeneous platforms with a rigorous, library-based methodology that can automatically distribute computation across heterogeneous cores to achieve increased energy and performance efficiency. The Merge framework provides (1) a predicate dispatch-based library system for managing and invoking function variants for multiple architectures; (2) a high-level, library-oriented parallel language based on map-reduce; and (3) a compiler and runtime that implement the map-reduce language pattern by dynamically selecting the best available function implementations for a given input and machine configuration. Using a generic sequencer architecture interface for heterogeneous accelerators, the Merge framework can integrate function variants for specialized accelerators, offering the potential for to-the-metal performance for a wide range of heterogeneous architectures, all transparent to the user. The Merge framework has been prototyped on a heterogeneous platform consisting of an Intel Core 2 Duo CPU and an 8-core, 32-thread Intel Graphics and Media Accelerator X3000, and on a homogeneous 32-way Unisys SMP system with Intel Xeon processors. We implemented a set of benchmarks using the Merge framework and enhanced the library with X3000-specific implementations, achieving speedups of 3.6x--8.5x using the X3000 and 5.2x--22x using the 32-way system relative to the straight C reference implementation on a single IA32 core.
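To illustrate the predicate dispatch idea described in the abstract, the following is a minimal sketch, not the actual Merge API: each function variant carries a predicate over the input and platform, and a dispatcher selects the first applicable variant at call time. All names (Variant, accelerator_present, sum_accelerated) are hypothetical stand-ins.

```cpp
// Sketch only: predicate-guarded function variants with runtime dispatch.
// Not the Merge framework's real interface.
#include <functional>
#include <iostream>
#include <vector>

struct Variant {
    // Predicate: is this variant applicable for the given input size / platform?
    std::function<bool(size_t)> applicable;
    // Implementation: sum a vector of floats (stand-in for a real kernel).
    std::function<float(const std::vector<float>&)> impl;
    const char* name;
};

// Hypothetical flag standing in for accelerator discovery (e.g. an X3000-class device).
static bool accelerator_present = false;

static float sum_scalar(const std::vector<float>& v) {
    float s = 0.0f;
    for (float x : v) s += x;
    return s;
}

static float sum_accelerated(const std::vector<float>& v) {
    // Placeholder: a real accelerator variant would offload this computation.
    return sum_scalar(v);
}

int main() {
    std::vector<Variant> variants = {
        // Accelerator variant: applicable only if the device exists and the
        // input is large enough to amortize offload overhead.
        { [](size_t n) { return accelerator_present && n >= 1024; },
          sum_accelerated, "accelerator" },
        // Generic CPU fallback: always applicable.
        { [](size_t) { return true; }, sum_scalar, "ia32" },
    };

    std::vector<float> data(100000, 1.0f);
    for (const Variant& v : variants) {
        if (v.applicable(data.size())) {  // predicate dispatch: first match wins
            std::cout << v.name << ": " << v.impl(data) << "\n";
            break;
        }
    }
    return 0;
}
```

In the paper's design the compiler and runtime perform this selection automatically from annotated library variants; the sketch above only shows the dispatch pattern in its simplest hand-rolled form.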
Publisher
Association for Computing Machinery (ACM)
Cited by
17 articles.
1. Analysis on Heterogeneous Computing;Journal of Physics: Conference Series;2021-09-01
2. Design of self‐adaptable data parallel applications on multicore clusters automatically optimized for performance and energy through load distribution;Concurrency and Computation: Practice and Experience;2018-08-30
3. An Efficient Programming Skeleton for Clusters of Multi-Core Processors;International Journal of Parallel Programming;2017-09-18
4. Understanding GPU Power;ACM Computing Surveys;2016-12-13
5. LondonTube;Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems;2016-05-07