Optimized Broadcast for Deep Learning Workloads on Dense-GPU InfiniBand Clusters

Author:

Awan Ammar Ahmad¹, Chu Ching-Hsiang¹, Subramoni Hari¹, Panda Dhabaleswar K.¹

Affiliation:

1. Department of Computer Science and Engineering, The Ohio State University, Columbus, Ohio

Publisher

ACM

References: 37 articles.

1. KESCH: Cray CS-Storm System. http://www.cscs.ch/computers/kesch_escha/index.html. (n.d.).

2. CNTK. 2015. http://www.cntk.ai/. [Online; accessed April 2016].

3. Martin Abadi, Ashish Agarwal, Paul Barham, Eugene Brevdo, Zhifeng Chen, Craig Citro, Greg S. Corrado, Andy Davis, Jeffrey Dean, Matthieu Devin, et al. 2015. TensorFlow: Large-Scale Machine Learning on Heterogeneous Systems. Software available from tensorflow.org.

4. S-Caffe

5. Efficient Large Message Broadcast using NCCL and CUDA-Aware MPI for Deep Learning

Cited by 21 articles.

1. Collective Communication Performance Evaluation for Distributed Deep Learning Training; Applied Sciences; 2024-06-12

2. DistMind: Efficient Resource Disaggregation for Deep Learning Workloads; IEEE/ACM Transactions on Networking; 2024-06

3. Real-time High-resolution X-Ray Computed Tomography; Proceedings of the 38th ACM International Conference on Supercomputing; 2024-05-30

4. Towards Accelerating k-NN with MPI and Near-Memory Processing; 2024 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW); 2024-05-27

5. Memory Transfer Decomposition: Exploring Smart Data Movement Through Architecture-Aware Strategies; Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis; 2023-11-12
