1. Alvarez, Compression-aware training of deep networks, 2017.
2. S.J. Hanson, L.Y. Pratt, Comparing biases for minimal network construction with back-propagation, in: Advances in Neural Information Processing Systems, 1989, pp. 177–185.
3. Y. LeCun, J.S. Denker, S.A. Solla, Optimal brain damage, in: Advances in Neural Information Processing Systems, 1990, pp. 598–605.
4. V. Lebedev, Y. Ganin, M. Rakhuba, I. Oseledets, V. Lempitsky, Speeding-up convolutional neural networks using fine-tuned CP-decomposition, in: International Conference on Learning Representations, 2015.
5. T. Garipov, D. Podoprikhin, A. Novikov, D. Vetrov, Ultimate tensorization: compressing convolutional and FC layers alike, arXiv preprint arXiv:1611.03214, 2016.