Affiliation:
1. Chinese Academy of Sciences, Beijing, China
2. Peking University, Beijing, China
Abstract
High-order stencil computations, frequently found in many applications, pose severe challenges to emerging many-core platforms due to the complexities of hardware architectures as well as the sophisticated computing and data movement patterns. In this article, we tackle the challenges of high-order WENO computations in extreme-scale simulations of 3D gaseous waves on Sunway TaihuLight. We design efficient parallelization algorithms and present effective optimization techniques to fully exploit various parallelisms with reduced memory footprints, enhanced data reuse, and balanced computation load. Test results show the optimized code can scale to 9.98 million cores, solving 12.74 trillion unknowns with 23.12 Pflops double-precision performance.
Funder
National Natural Science Foundation of China
National Key R&D Plan of China
Publisher
Association for Computing Machinery (ACM)
Subject
Hardware and Architecture, Information Systems, Software
Cited by
2 articles.