1. M. Naumov, J. Kim, D. Mudigere, S. Sridharan, X. Wang, W. Zhao, S. Yilmaz, C. Kim, H. Yuen, M. Ozdal, “Deep learning training in Facebook data centers: Design of scale-up and scale-out systems,” arXiv:2003.09518 (2020).
2. “Efficient large-scale language model training on GPU clusters using Megatron-LM,” (2021).
3. “Software-hardware co-design for fast and scalable training of deep learning recommendation models,” (2022).
4. D. Patterson, J. Gonzalez, Q. Le, C. Liang, L.-M. Munguia, D. Rothchild, D. So, M. Texier, and J. Dean, “Carbon emissions and large neural network training,” arXiv:2104.10350 (2021).