Affiliation:
1. Northeastern Univ., China
2. Aalborg Univ., Denmark
Abstract
Dynamic Graph Neural Networks (DGNNs) have demonstrated exceptional performance at dynamic-graph analysis tasks. However, DGNN training incurs substantially higher costs than other learning tasks, to the point where deployment on large-scale dynamic graphs becomes infeasible. Existing distributed frameworks that facilitate DGNN training are still in their early stages and suffer from challenges such as communication bottlenecks, imbalanced workloads, and GPU memory overflow.
We introduce DynaHB, a distributed framework for DGNN training using so-called Hybrid Batches. DynaHB reduces communication by means of vertex caching, and it ensures even data and workload distribution by means of load-aware vertex partitioning. DynaHB also features a novel hybrid-batch training mode that combines vertex-batch and snapshot-batch techniques, thereby reducing both training time and GPU memory usage. To further enhance the hybrid-batch approach, DynaHB integrates a reinforcement learning-based batch adjuster and a pipelined batch generator with a batch reservoir that reduce the cost of generating hybrid batches. Extensive experiments show that DynaHB achieves up to 93× and an average of 8.06× speedup over the state-of-the-art training framework.
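The following is a minimal illustrative sketch of the hybrid-batch idea described above, not DynaHB's actual API: all names, parameters, and the sampling policy are assumptions. It shows how pairing a vertex batch with a snapshot batch lets each training step touch only a (vertex subset) × (snapshot window) block of the dynamic graph rather than all vertices across all snapshots.

```python
# Hypothetical sketch of hybrid-batch generation; names and policy are assumed,
# not taken from DynaHB itself.
import random
from typing import List, Tuple


def hybrid_batches(num_vertices: int,
                   num_snapshots: int,
                   vertex_batch_size: int,
                   snapshot_batch_size: int,
                   seed: int = 0) -> List[Tuple[List[int], List[int]]]:
    """Yield (vertex_ids, snapshot_ids) pairs covering all vertices and snapshots."""
    rng = random.Random(seed)
    vertices = list(range(num_vertices))
    rng.shuffle(vertices)  # randomize vertex order once per epoch
    batches = []
    for v_start in range(0, num_vertices, vertex_batch_size):
        v_batch = vertices[v_start:v_start + vertex_batch_size]
        for s_start in range(0, num_snapshots, snapshot_batch_size):
            s_end = min(s_start + snapshot_batch_size, num_snapshots)
            s_batch = list(range(s_start, s_end))
            # One hybrid batch = a vertex subset combined with a snapshot window.
            batches.append((v_batch, s_batch))
    return batches


if __name__ == "__main__":
    for v_ids, s_ids in hybrid_batches(num_vertices=8, num_snapshots=6,
                                       vertex_batch_size=4, snapshot_batch_size=3):
        print(f"train on vertices {v_ids} over snapshots {s_ids}")
```

In this sketch, shrinking either batch dimension bounds the amount of graph state resident on a GPU at once, which is the memory/time trade-off the abstract attributes to hybrid batching; how DynaHB sizes and schedules these batches (e.g., via its RL-based adjuster) is not reflected here.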
Publisher
Association for Computing Machinery (ACM)