Authors:
Paznikov Alexey A., Burachenko Alexander V., Abuelsoud Mohamed M.
Abstract
In parallel programming for distributed-memory systems with the MPI standard, the remote memory access model (also known as one-sided communication, MPI RMA, or RMA) is used alongside message passing. In many cases this model improves performance and simplifies parallel programming. It also raises the problem of synchronizing multiple parallel processes that access shared (concurrent, distributed) data structures. On shared-memory machines (such as SMP/NUMA systems), non-blocking (lock-free, wait-free, obstruction-free) synchronization is widely used to solve a similar problem. The main advantage of non-blocking synchronization is that a delay in the execution of one process (thread) does not suspend the execution of the others, which avoids deadlocks, priority inversion, and related problems. We expect this approach to be effective for designing distributed data structures as well, particularly in the MPI RMA model. In this article, we discuss the idea of building non-blocking distributed data structures in the MPI RMA model using a queue as an example, describe the algorithms designed for its basic operations, investigate the efficiency of the resulting data structure, and provide an experimental comparison with lock-based counterparts.
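To make the notion of non-blocking synchronization concrete, the following is a minimal shared-memory sketch of a lock-free queue in the style of Michael and Scott, written with C11 atomics. It is not the authors' MPI RMA algorithm, which the abstract does not spell out; the names `lf_queue_t`, `lf_enqueue`, and `lf_dequeue` are hypothetical. In an MPI RMA setting, the compare-and-swap steps would presumably be issued against a remote window with a call such as `MPI_Compare_and_swap`, but that mapping is an assumption. Dequeued nodes are deliberately never freed here, since safe memory reclamation (hazard pointers, epochs) is outside the scope of the sketch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical node and queue types for illustration only. */
typedef struct node {
    int value;
    _Atomic(struct node *) next;
} node_t;

typedef struct {
    _Atomic(node_t *) head; /* always points at a dummy node */
    _Atomic(node_t *) tail; /* last node, or the one just before it */
} lf_queue_t;

static node_t *node_new(int value) {
    node_t *n = malloc(sizeof *n);
    assert(n != NULL);
    n->value = value;
    atomic_init(&n->next, NULL);
    return n;
}

void lf_queue_init(lf_queue_t *q) {
    node_t *dummy = node_new(0);
    atomic_init(&q->head, dummy);
    atomic_init(&q->tail, dummy);
}

void lf_enqueue(lf_queue_t *q, int value) {
    node_t *n = node_new(value);
    for (;;) {
        node_t *tail = atomic_load(&q->tail);
        node_t *next = atomic_load(&tail->next);
        if (next != NULL) {
            /* Tail is lagging: help the other thread instead of waiting. */
            atomic_compare_exchange_weak(&q->tail, &tail, next);
            continue;
        }
        node_t *expected = NULL;
        /* Link the new node after the current last node. */
        if (atomic_compare_exchange_weak(&tail->next, &expected, n)) {
            /* Try to swing tail; failure means someone helped already. */
            atomic_compare_exchange_strong(&q->tail, &tail, n);
            return;
        }
    }
}

bool lf_dequeue(lf_queue_t *q, int *out) {
    for (;;) {
        node_t *head = atomic_load(&q->head);
        node_t *tail = atomic_load(&q->tail);
        node_t *next = atomic_load(&head->next);
        if (next == NULL) {
            return false; /* queue is empty */
        }
        if (head == tail) {
            /* An enqueue is half done: help advance tail, then retry. */
            atomic_compare_exchange_weak(&q->tail, &tail, next);
            continue;
        }
        int value = next->value;
        /* Advance head; 'next' becomes the new dummy node. */
        if (atomic_compare_exchange_weak(&q->head, &head, next)) {
            *out = value;
            /* The old dummy is leaked on purpose (no safe reclamation here). */
            return true;
        }
    }
}
```

No operation ever holds a lock: a thread whose compare-and-swap fails simply retries, and a thread that finds a half-finished enqueue completes it rather than waiting, so a stalled thread cannot block the others. A caller would initialize the queue once with `lf_queue_init` and then call `lf_enqueue` and `lf_dequeue` from any number of threads.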