Author:
Tipu Abdul Jabbar Saeed, Conbhuí Pádraig Ó, Howley Enda
Abstract
ExSeisDat is built on the standard message passing interface (MPI) library for seismic data processing on high-performance supercomputing clusters. These clusters are generally designed for efficient execution of complex tasks, including large-scale IO. IO performance degrades when multiple processes try to access data on parallel networked storage. The degradation is caused by the restrictive protocols run by the parallel file system (PFS) that controls the disks, and by the slower advancement of the storage hardware itself. Optimizing IO performance therefore requires tuning specific configuration parameters, which users focused on writing parallel applications commonly do not consider. Even when they are considered, the appropriate parameter values change from case to case. The degradation grows further for large SEG-Y format seismic data files that scale to petabytes. SEG-Y IO and file sorting are two of the main features of ExSeisDat. This paper proposes a technique based on artificial neural networks (ANNs) to optimize these SEG-Y operations. The optimization auto-tunes the related configuration parameters using IO bandwidth predictions from ANN models trained through a machine learning (ML) process. Furthermore, we discuss how varying the hidden-layer node configuration of the ANNs affects prediction accuracy, and present a statistical analysis of the auto-tuning bandwidth results. The results show an overall improvement in bandwidth performance of up to 108.8% and 237.4% in the combined SEG-Y IO and file sorting operations test cases, respectively. The paper thus demonstrates a significant gain in SEG-Y seismic data bandwidth performance from auto-tuning the parameter settings at runtime with an ML approach.
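The auto-tuning idea described in the abstract can be illustrated with a minimal sketch: a predictor estimates IO bandwidth for each candidate configuration, and the configuration with the highest predicted bandwidth is selected. Everything named below is an assumption made for illustration only. The abstract does not list the actual tunable parameters, so `stripe_count`, `stripe_size_mib`, and `collective_buffering` are hypothetical, and the small network with placeholder weights stands in for a trained ANN; it is not the authors' model.

```python
import itertools
import math

# Hypothetical search space; the abstract does not name the real parameters.
SEARCH_SPACE = {
    "stripe_count": [1, 4, 16, 64],
    "stripe_size_mib": [1, 4, 16],
    "collective_buffering": [0, 1],
}

# Placeholder weights for a 3-input, 2-hidden-node network (not trained values).
PLACEHOLDER_WEIGHTS = {
    "w1": [[0.30, 0.20, 0.50], [-0.10, 0.40, 0.25]],
    "b1": [0.0, 0.1],
    "w2": [800.0, 600.0],
    "b2": 200.0,
}


def predict_bandwidth(config, weights):
    """Stand-in for a trained ANN: one tanh hidden layer, linear output."""
    features = [
        math.log2(config["stripe_count"]),
        math.log2(config["stripe_size_mib"]),
        float(config["collective_buffering"]),
    ]
    hidden = [
        math.tanh(sum(w * x for w, x in zip(row, features)) + bias)
        for row, bias in zip(weights["w1"], weights["b1"])
    ]
    return sum(w * h for w, h in zip(weights["w2"], hidden)) + weights["b2"]


def all_configs(search_space):
    """Yield every combination of parameter values as a dict."""
    keys = list(search_space)
    for values in itertools.product(*(search_space[k] for k in keys)):
        yield dict(zip(keys, values))


def auto_tune(search_space, weights):
    """Return the configuration with the highest predicted bandwidth."""
    return max(all_configs(search_space),
               key=lambda cfg: predict_bandwidth(cfg, weights))


def improvement_percent(default_bw, tuned_bw):
    """Relative bandwidth gain of the tuned run over the default run, in %."""
    return (tuned_bw - default_bw) / default_bw * 100.0
```

Under this reading, an improvement of 108.8% would correspond to a tuned bandwidth slightly more than double the default; whether the paper computes its percentages exactly this way is not stated in the abstract.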
Publisher
Springer Science and Business Media LLC
Subject
Artificial Intelligence, Software