Abstract
With the development of peta- and exascale computing systems, there is growing interest in running Big Data and Artificial Intelligence (AI) applications on them. Big Data and AI applications are implemented in Java, Scala, Python and other languages that are not widely used in High-Performance Computing (HPC), which is still dominated by C and Fortran. Moreover, they rely on dedicated environments such as Hadoop or Spark, which are difficult to integrate with traditional HPC management systems. We have developed the Parallel Computing in Java (PCJ) library, a tool for scalable high-performance computing and Big Data processing in Java. In this paper, we present the basic functionality of the PCJ library with examples of highly scalable applications running on large-scale resources. Performance results are presented for different classes of applications, including traditional computationally intensive (HPC) workloads (e.g. stencil) as well as communication-intensive algorithms such as the Fast Fourier Transform (FFT). We present implementation details and performance results for Big Data processing running on petascale systems. Examples of large-scale AI workloads parallelized using PCJ are also presented.
Publisher
Springer Science and Business Media LLC
Subject
Information Systems and Management, Computer Networks and Communications, Hardware and Architecture, Information Systems
Cited by
9 articles.