Publisher
Springer International Publishing
References
38 articles.
1. Abadi, M., et al.: TensorFlow: a system for large-scale machine learning. In: Proceedings of the 12th USENIX Conference on Operating Systems Design and Implementation, OSDI 2016, pp. 265–283. USENIX Association, USA (2016)
2. Amari, S.: Backpropagation and stochastic gradient descent method. Neurocomputing 5(4–5), 185–196 (1993)
3. Azimi, R., Tam, D.K., Soares, L., Stumm, M.: Enhancing operating system support for multicore processors by using hardware performance monitoring. ACM SIGOPS Oper. Syst. Rev. 43(2), 56–65 (2009)
4. Barrow-Williams, N., Fensch, C., Moore, S.: A communication characterisation of Splash-2 and Parsec. In: 2009 IEEE International Symposium on Workload Characterization (IISWC), pp. 86–97. IEEE (2009)
5. Ben-Nun, T., Hoefler, T.: Demystifying parallel and distributed deep learning: an in-depth concurrency analysis. ACM Comput. Surv. (CSUR) 52(4), 1–43 (2019)
Cited by
3 articles.
1. Snoopie: A Multi-GPU Communication Profiler and Visualizer;Proceedings of the 38th ACM International Conference on Supercomputing;2024-05-30
2. GPU-Initiated Resource Allocation for Irregular Workloads;Proceedings of the 3rd International Workshop on Extreme Heterogeneity Solutions;2024-03-02
3. Monitoring Collective Communication Among GPUs;Euro-Par 2021: Parallel Processing Workshops;2022