1. Exploiting linear structure within convolutional networks for efficient evaluation;denton;Proc Adv Neural Inf Process Syst,2014
2. Network trimming: A data-driven neuron pruning approach towards efficient deep architectures;hu;arXiv:1607.03250,2016
3. Compressing deep neural networks with pruning, trained quantization and Huffman coding;han;arXiv:1510.00149 [cs],2015
4. Distilling the knowledge in a neural network;hinton;arXiv:1503.02531,2015
5. Binarized neural networks;hubara;Proc Adv Neural Inf Process Syst,2016