Affiliation:
1. National Center for Computational Sciences, Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA
Abstract
The Oak Ridge Leadership Computing Facility (OLCF) has a long history of supporting and promoting GPU‐accelerated computing, starting with the deployment of the Titan supercomputer in 2012 and continuing with the Summit supercomputer, which has a theoretical peak performance of approximately 200 petaflops. Because the majority of Summit's computational power comes from its 27,972 GPUs, users must port their applications to one of the supported programming models in order to make efficient use of the system. To prepare for the transition to Frontier, the OLCF's exascale supercomputer, users will need to adapt to an entirely new ecosystem that includes new hardware and software technologies. First, users will need to familiarize themselves with the AMD Radeon GPU architecture. Second, users who have previously relied on CUDA will need to transition to the Heterogeneous‐Computing Interface for Portability (HIP) or one of the other supported programming models (e.g., OpenMP, OpenACC). In this work, we describe our initial experiences and lessons learned in porting three applications or proxy apps currently running on Summit (minisweep, GenASiS, and Sparkler) to the HPE/Cray ecosystem in order to leverage the compute power of AMD GPUs. Each is representative of current production workloads at the OLCF, and together they span different programming languages and different programming models.
Funder
Oak Ridge National Laboratory