1. Yanjie Gao, Yu Liu, Hongyu Zhang, Zhengxian Li, Yonghao Zhu, Haoxiang Lin, Mao Yang, "Estimating GPU Memory Consumption of Deep Learning Models", 28th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE), pp. 1342-1352, Nov 2020.
2. Song Han, Xingyu Liu, Huizi Mao, Jing Pu, Ardavan Pedram, Mark A. Horowitz, William J. Dally, "EIE: Efficient Inference Engine on Compressed Deep Neural Network", IEEE 43rd Annual International Symposium on Computer Architecture (ISCA), vol. 44, no. 3, pp. 243-254, Jun 2016.
3. Song Han, Huizi Mao, William J. Dally, "Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding", arXiv:1510.00149 [cs.CV], Feb 2016.
4. Nimit S. Sohoni, Christopher R. Aberger, Megan Leszczynski, Jian Zhang, Christopher Ré, "Low-Memory Neural Network Training: A Technical Report", arXiv:1904.10631 [cs.LG], Apr 2019.
5. Aashaka Shah, Chao-Yuan Wu, Jayashree Mohan, Vijay Chidambaram, Philipp Krähenbühl, "Memory Optimization for Deep Networks", arXiv:2010.14501 [cs.LG], Oct 2020.