MPress: Democratizing Billion-Scale Model Training on Multi-GPU Servers via Memory-Saving Inter-Operator Parallelism

Published: 2023-02
Container-title: 2023 IEEE International Symposium on High-Performance Computer Architecture (HPCA)

Author:

Zhou Quan¹, Wang Haiquan¹, Yu Xiaoyan¹, Li Cheng¹, Bai Youhui¹, Yan Feng², Xu Yinlong¹

Affiliation:

1. University of Science and Technology of China

2. University of Houston

Funder

National Natural Science Foundation of China

National Science Foundation

Publisher

IEEE

References: 64 articles.

1. The Code Repo for PipeDream: Pipeline Parallelism for DNN Training

2. Efficient Transformers: A Survey; Tay, 2022

3. Stanford Question Answering Dataset v1.1

4. Delta: Dynamically Optimizing GPU Memory beyond Tensor Recomputation; Tang, 2022

5. Wikipedia Dataset

Cited by 5 articles.

1. Mobile Foundation Model as Firmware; Proceedings of the 30th Annual International Conference on Mobile Computing and Networking; 2024-05-29

3. AdaPipe: Optimizing Pipeline Parallelism with Adaptive Recomputation and Partitioning; Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 3; 2024-04-27

4. Liger: Interleaving Intra- and Inter-Operator Parallelism for Distributed Large Model Inference; Proceedings of the 29th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming; 2024-02-20