1. Adam: A method for stochastic optimization;kingma;Proceedings of the International Conference on Learning Represen-tations,2015
2. Training deep nets with sublinear memory cost;tianqi;ArXiv Preprint,2016
3. Dynamic tensor rematerialization;kirisame;International Conference on Learning Representations,2021
4. Distribution adaptive int8 quantization for training cnns;kang;Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence,2021
5. Mixed precision training;paulius;International Conference on Learning Representations,2018