MILC Code Performance on High End CPU and GPU Supercomputer Clusters

Published:2018 Volume:175 Page:02009
ISSN:2100-014X
Container-title:EPJ Web of Conferences
Short-container-title:EPJ Web Conf.

Author:

DeTar Carleton,Gottlieb Steven,Li Ruizi,Toussaint Doug

Abstract

With recent developments in parallel supercomputing architecture, many-core, multi-core, and GPU processors are now commonplace, resulting in more levels of parallelism, memory hierarchy, and programming complexity. It has been necessary to adapt the MILC code to these new processors, starting with NVIDIA GPUs and, more recently, the Intel Xeon Phi processors. We report on our efforts to port and optimize our code for the Intel Knights Landing architecture. We consider performance of the MILC code with MPI and OpenMP, and optimizations with QOPQDP and QPhiX. For the latter approach, we concentrate on the staggered conjugate gradient and gauge force. We also consider performance on recent NVIDIA GPUs using the QUDA library.

Publisher

EDP Sciences

Link

https://www.epj-conferences.org/10.1051/epjconf/201817502009/pdf

References: 7 articles.

1. Jeffers J., et al., Intel Xeon Phi Processor High Performance Programming: Knights Landing Edition, Morgan Kaufmann, pp. 581–597 (2016)

2. Benchmarking MILC code with OpenMP and MPI

3. Babich R., et al., Parallelizing the QUDA Library for Multi-GPU Calculations in Lattice Quantum Chromodynamics, SC’10: Proceedings of the 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis, 1710.01000

4. Boyle P., et al., Grid: A next generation data parallel C++ QCD library, 1512.03487

5. Stampede2 - Texas Advanced Computing Center, https://www.tacc.utexas.edu/systems/stampede2

Cited by 2 articles.

1. Portable CPU implementation of Wilson, Brillouin and Susskind fermions in lattice QCD;Computer Physics Communications;2023-01

2. Efficient Execution of OpenMP on GPUs;2022 IEEE/ACM International Symposium on Code Generation and Optimization (CGO);2022-04-02