Scalable Graph Convolutional Network Training on Distributed-Memory Systems

Published:2022-12 Issue:4 Volume:16 Page:711-724
ISSN:2150-8097
Container-title:Proceedings of the VLDB Endowment
language:en
Short-container-title:Proc. VLDB Endow.

Author:

Demirci Gunduz Vehbi¹,Haldar Aparajita²,Ferhatosmanoglu Hakan²

Affiliation:

1. Imagination Technologies, United Kingdom

2. University of Warwick, United Kingdom

Abstract

Graph Convolutional Networks (GCNs) are extensively utilized for deep learning on graphs. The large data sizes of graphs and their vertex features make scalable training algorithms and distributed memory systems necessary. Since the convolution operation on graphs induces irregular memory access patterns, designing a memory- and communication-efficient parallel algorithm for GCN training poses unique challenges. We propose a highly parallel training algorithm that scales to large processor counts. In our solution, the large adjacency and vertex-feature matrices are partitioned among processors. We exploit the vertex-partitioning of the graph to use non-blocking point-to-point communication operations between processors for better scalability. To further minimize the parallelization overheads, we introduce a sparse matrix partitioning scheme based on a hypergraph partitioning model for full-batch training. We also propose a novel stochastic hypergraph model to encode the expected communication volume in mini-batch training. We show the merits of the hypergraph model, previously unexplored for GCN training, over the standard graph partitioning model which does not accurately encode the communication costs. Experiments performed on real-world graph datasets demonstrate that the proposed algorithms achieve considerable speedups over alternative solutions. The optimizations achieved on communication costs become even more pronounced at high scalability with many processors. The performance benefits are preserved in deeper GCNs having more layers as well as on billion-scale graphs.
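The abstract's claim that the graph partitioning model does not accurately encode communication costs can be illustrated with a toy calculation. The sketch below is not the authors' partitioner. It assumes a row-wise partition in which each vertex's feature row lives on the processor that owns its adjacency row, and the function names `cut_nonzeros` and `communication_volume` are hypothetical. It compares the count of cut nonzeros (what an edge-cut objective measures) with the number of feature rows actually sent (what a connectivity-based hypergraph objective measures).

```python
from collections import defaultdict

def cut_nonzeros(nonzeros, part):
    """Edge-cut style cost: count nonzeros (i, j) whose row i and
    column j are owned by different processors."""
    return sum(1 for i, j in nonzeros if part[i] != part[j])

def communication_volume(nonzeros, part):
    """Hypergraph style cost: the feature row of vertex j is sent once
    to each distinct remote processor that owns a row needing it."""
    receivers = defaultdict(set)
    for i, j in nonzeros:
        if part[i] != part[j]:
            receivers[j].add(part[i])
    return sum(len(parts) for parts in receivers.values())

# Toy adjacency pattern (an assumption for illustration):
# rows 1, 2 and 3 each aggregate the features of vertex 0.
nonzeros = [(1, 0), (2, 0), (3, 0)]
# Vertex 0 is on processor 0; vertices 1, 2, 3 are on processor 1.
part = {0: 0, 1: 1, 2: 1, 3: 1}

print(cut_nonzeros(nonzeros, part))          # 3 cut nonzeros
print(communication_volume(nonzeros, part))  # 1 feature row actually sent
```

In this example the edge-cut count is three, yet processor 0 only needs to send vertex 0's feature row to processor 1 once, so minimizing the edge cut would overestimate the traffic.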

Publisher

Association for Computing Machinery (ACM)

Subject

General Earth and Planetary Sciences; Water Science and Technology; Geography, Planning and Development

Link

https://dl.acm.org/doi/pdf/10.14778/3574245.3574256

References: 71 articles.

1. Efficient Large Message Broadcast using NCCL and CUDA-Aware MPI for Deep Learning

2. Communication-optimal parallel algorithm for Strassen's matrix multiplication

3. Neil Band. 2020. MemFlow: Memory-Aware Distributed Deep Learning. In Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data. 2883-2885.

4. Zhenkun Cai, Xiao Yan, Yidi Wu, Kaihao Ma, James Cheng, and Fan Yu. 2021. DGCL: An efficient communication library for distributed GNN training. In Proceedings of the Sixteenth European Conference on Computer Systems. 130-144.

5. Hypergraph-partitioning-based decomposition for parallel sparse-matrix vector multiplication

Cited by 5 articles.

1. Text-Rich Graph Neural Networks With Subjective-Objective Semantic Modeling;IEEE Transactions on Knowledge and Data Engineering;2024-09

2. Sylvie: 3D-Adaptive and Universal System for Large-Scale Graph Neural Network Training;2024 IEEE 40th International Conference on Data Engineering (ICDE);2024-05-13

3. Distributed Graph Neural Network Training: A Survey;ACM Computing Surveys;2024-04-10

4. Fight Fire with Fire: Towards Robust Graph Neural Networks on Dynamic Graphs via Actively Defense;Proceedings of the VLDB Endowment;2024-04

5. Low-bit Quantization for Deep Graph Neural Networks with Smoothness-aware Message Propagation;Proceedings of the 32nd ACM International Conference on Information and Knowledge Management;2023-10-21