Affiliation:
1. Nordea Capital Markets, Copenhagen, Denmark
2. LexiFi
3. University of Copenhagen
4. University of Copenhagen, Copenhagen, Denmark
Abstract
Commodity many-core hardware is now mainstream, but parallel programming models still lag behind in efficiently exploiting application parallelism. There are (at least) two principal reasons for this. First, real-world programs often take the form of a deeply nested composition of parallel operators, but mapping the available parallelism to the hardware requires a set of transformations that are tedious to do by hand and beyond the capability of the common user. Second, the best optimization strategy, such as what to parallelize and what to efficiently sequentialize, is often sensitive to the input dataset and therefore requires multiple code versions that are optimized differently, which also raises maintainability problems.
This article presents three array-based applications from the financial domain that are suitable for GPGPU execution. Common benchmark-design practice has been to provide the same code for the sequential and parallel versions that are optimized for only one class of datasets. In comparison, we document (1) all available parallelism via nested map-reduce functional combinators, in a simple Haskell implementation that closely resembles the original code structure, (2) the invariants and code transformations that govern the main trade-offs of a data-sensitive optimization space, and (3) target CPU and multiversion GPGPU code, together with an evaluation that demonstrates optimization trade-offs and other difficulties. We believe that this work provides useful insight into the language constructs and compiler infrastructure capable of expressing and optimizing such applications, and we report in-progress work in this direction.
Funder
Danish Council for Strategic Research
Publisher
Association for Computing Machinery (ACM)
Subject
Hardware and Architecture, Information Systems, Software
Cited by
16 articles.
1. Memory Optimizations in an Array Language;SC22: International Conference for High Performance Computing, Networking, Storage and Analysis;2022-11
2. Distributed parallel computing with Futhark: a functional language to generate distributed parallel code;Proceedings of the 8th ACM SIGPLAN International Workshop on Libraries, Languages and Compilers for Array Programming;2022-06-13
3. Acceleration of lattice models for pricing portfolios of fixed-income derivatives;Proceedings of the 7th ACM SIGPLAN International Workshop on Libraries, Languages and Compilers for Array Programming;2021-06-17
4. Towards size-dependent types for array programming;Proceedings of the 7th ACM SIGPLAN International Workshop on Libraries, Languages and Compilers for Array Programming;2021-06-17
5. Bounds Checking on GPU;International Journal of Parallel Programming;2021-03-25