Affiliation
1. Tsinghua University, Beijing, China
2. Imperial College London, London, UK
Abstract
This article demonstrates an approach for combining general tuning techniques with the POWER8 hardware architecture by optimizing three representative stencil benchmarks. Two typical real-world applications, with kernels similar to those of the winning programs of the 2016 and 2017 Gordon Bell Prizes, are used to illustrate algorithm modifications and the combination of hardware-oriented tuning strategies with the application algorithms. This work fills the gap between hardware capability and software performance of the POWER8 processor, and provides useful guidance for optimizing stencil-based scientific applications on POWER systems.
Funder
UK EPSRC
China Postdoctoral Science Foundation
National Natural Science Foundation of China
Tsinghua University Initiative Scientific Research Program
National Key Research & Development Plan of China
EU Horizon 2020 Research and Innovation Programme
Publisher
Association for Computing Machinery (ACM)
Subject
Hardware and Architecture, Information Systems, Software
Cited by
3 articles.
1. Optimizing Cloud Computing Resource Usage for Hemodynamic Simulation;2023 IEEE International Parallel and Distributed Processing Symposium (IPDPS);2023-05
2. Hodgkin-Huxley-Based Neural Simulation with Networks Connecting to Near-Neighbor Neurons;2021 IEEE 32nd International Conference on Application-specific Systems, Architectures and Processors (ASAP);2021-07
3. Low Precision Processing for High Order Stencil Computations;Lecture Notes in Computer Science;2019