Author:
DeTar Carleton, Gottlieb Steven, Li Ruizi, Toussaint Doug
Abstract
With recent developments in parallel supercomputing architecture, many-core, multi-core, and GPU processors are now commonplace, resulting in more levels of parallelism, deeper memory hierarchies, and greater programming complexity. It has been necessary to adapt the MILC code to these new processors, starting with NVIDIA GPUs and, more recently, the Intel Xeon Phi processors. We report on our efforts to port and optimize our code for the Intel Knights Landing architecture. We consider the performance of the MILC code with MPI and OpenMP, and optimizations with QOPQDP and QPhiX. For the latter approach, we concentrate on the staggered conjugate gradient and gauge force. We also consider performance on recent NVIDIA GPUs using the QUDA library.
References
7 articles.
1. Jeffers J., et al., Intel Xeon Phi Processor High Performance Programming: Knights Landing Edition, Morgan Kaufmann, pp. 581–597 (2016)
2. Benchmarking MILC code with OpenMP and MPI
3. Babich R., et al., Parallelizing the QUDA Library for Multi-GPU Calculations in Lattice Quantum Chromodynamics, SC’10: Proceedings of the 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis, 1710.01000
4. Boyle P., et al., Grid: A next generation data parallel C++ QCD library, 1512.03487
5. Stampede2 - Texas Advanced Computing Center, https://www.tacc.utexas.edu/systems/stampede2
Cited by
2 articles.
1. Portable CPU implementation of Wilson, Brillouin and Susskind fermions in lattice QCD;Computer Physics Communications;2023-01
2. Efficient Execution of OpenMP on GPUs;2022 IEEE/ACM International Symposium on Code Generation and Optimization (CGO);2022-04-02