FinPar

Published:2016-06-27 Issue:2 Volume:13 Page:1-27
ISSN:1544-3566
Container-title:ACM Transactions on Architecture and Code Optimization
language:en
Short-container-title:ACM Trans. Archit. Code Optim.

Author:

Andreetta Christian¹,Bégot Vivien²,Berthold Jost³,Elsman Martin⁴,Henglein Fritz⁴,Henriksen Troels⁴,Nordfang Maj-Britt⁴,Oancea Cosmin E.⁴

Affiliation:

1. Nordea Capital Markets, Copenhagen, Denmark

2. LexiFi

3. University of Copenhagen

4. University of Copenhagen, Copenhagen, Denmark

Abstract

Commodity many-core hardware is now mainstream, but parallel programming models are still lagging behind in efficiently utilizing the application parallelism. There are (at least) two principal reasons for this. First, real-world programs often take the form of a deeply nested composition of parallel operators, but mapping the available parallelism to the hardware requires a set of transformations that are tedious to do by hand and beyond the capability of the common user. Second, the best optimization strategy, such as what to parallelize and what to efficiently sequentialize, is often sensitive to the input dataset and therefore requires multiple code versions that are optimized differently, which also raises maintainability problems. This article presents three array-based applications from the financial domain that are suitable for GPGPU execution. Common benchmark-design practice has been to provide the same code for the sequential and parallel versions that are optimized for only one class of datasets. In comparison, we document (1) all available parallelism via nested map-reduce functional combinators, in a simple Haskell implementation that closely resembles the original code structure, (2) the invariants and code transformations that govern the main trade-offs of a data-sensitive optimization space, and (3) report target CPU and multiversion GPGPU code together with an evaluation that demonstrates optimization trade-offs and other difficulties. We believe that this work provides useful insight into the language constructs and compiler infrastructure capable of expressing and optimizing such applications, and we report in-progress work in this direction.

Funder

Danish Council for Strategic Research

Publisher

Association for Computing Machinery (ACM)

Subject

Hardware and Architecture,Information Systems,Software

Link

https://dl.acm.org/doi/pdf/10.1145/2898354

References: 69 articles.

1. Certified symbolic management of financial multi-party contracts

2. Lecture Notes in Computer Science;Barendsen Erik

3. Automatic C-to-CUDA Code Generation for Affine Programs

Cited by 16 articles.

1. Memory Optimizations in an Array Language;SC22: International Conference for High Performance Computing, Networking, Storage and Analysis;2022-11

2. Distributed parallel computing with Futhark: a functional language to generate distributed parallel code;Proceedings of the 8th ACM SIGPLAN International Workshop on Libraries, Languages and Compilers for Array Programming;2022-06-13

3. Acceleration of lattice models for pricing portfolios of fixed-income derivatives;Proceedings of the 7th ACM SIGPLAN International Workshop on Libraries, Languages and Compilers for Array Programming;2021-06-17

4. Towards size-dependent types for array programming;Proceedings of the 7th ACM SIGPLAN International Workshop on Libraries, Languages and Compilers for Array Programming;2021-06-17

5. Bounds Checking on GPU;International Journal of Parallel Programming;2021-03-25