1. Ron Banner, Yury Nahshan, and Daniel Soudry. 2019. Post training 4-bit quantization of convolutional networks for rapid-deployment. In Proceedings of the 33rd International Conference on Neural Information Processing Systems. 7950–7958. https://proceedings.neurips.cc/paper/2019/hash/c0a62e133894cdce435bcb4a5df1db2d-Abstract.html
2. Ali Ehteshami Bejnordi and Ralf Krestel. 2020. Dynamic channel and layer gating in convolutional neural networks. In KI 2020: Advances in Artificial Intelligence. Lecture Notes in Computer Science, Vol. 12325. Springer, 33–45. https://doi.org/10.1007/978-3-030-58285-2_3
3. Davis W. Blalock, Jose Javier Gonzalez Ortiz, Jonathan Frankle, and John V. Guttag. 2020. What is the state of neural network pruning? In Proceedings of the Conference on Machine Learning and Systems (MLSys’20).
4. Tolga Bolukbasi, Joseph Wang, Ofer Dekel, and Venkatesh Saligrama. 2017. Adaptive neural networks for efficient inference. In Proceedings of the 34th International Conference on Machine Learning. 527–536. http://proceedings.mlr.press/v70/bolukbasi17a.html
5. Han Cai, Chuang Gan, Tianzhe Wang, Zhekai Zhang, and Song Han. 2020. Once-for-all: Train one network and specialize it for efficient deployment. In Proceedings of the 8th International Conference on Learning Representations (ICLR’20). https://openreview.net/forum?id=HylxE1HKwS