Author:
Ma Ying-Jin, Zhang Tan, He Lian-Hua, Jin Zhong
Abstract
First-principles methods represent the state of the art in computational materials science and molecular modeling, and the corresponding first-principles software packages are closely tied to the accumulation of theories and algorithms in this field. In this paper, we report our recent progress in refactoring the first-principles package BSTATE. The key goals of the refactoring are lowering the barrier to entry, extending the scope of application, and adapting the package to popular computer hardware. To this end, we migrated the build system from Makefile to CMake, which allows a GUI to be used and many math libraries to be configured automatically; we added support for the Libxc library, which provides a large number of density functionals; and we updated the interface to support GPUs, and thus heterogeneous computing systems. After refactoring, BSTATE can be built with either Makefile or CMake, and the Fourier transform libraries (FFTW2, FFTW3, and cuFFTW), the math libraries (Intel MKL and OpenBLAS), and the density functional library Libxc can be assigned automatically or manually. The integration of FFTW3 slightly improves the computational efficiency on Intel's Many Integrated Core (MIC) architecture, and the integration of cuFFTW provides initial support for the graphics processing unit (GPU) architecture. With the Libxc library, BSTATE can use hundreds of density functionals, and the use of various functionals is demonstrated by calculating the density of states of GaAs. Beyond the integration of these libraries, the parallel performance of BSTATE was also investigated. The Fourier transforms and the solution of the eigenvalue equations are found to be the major contributors to the computational cost. Using the Tuning and Analysis Utilities (TAU) tool, we found that the tasks are well distributed on modern HPC clusters, which implies that the refactoring did not affect the parallel efficiency of the original BSTATE package. In a subsequent benchmark test on graphene fragments, the refactored BSTATE package showed the best performance: its FFTW3 & Libxc version is about 0–17% faster than the FFTW2 version.
Publisher
Acta Physica Sinica, Chinese Physical Society and Institute of Physics, Chinese Academy of Sciences
Subject
General Physics and Astronomy