1. Jacob, B., Kligys, S., Chen, B., Zhu, M., Tang, M., Howard, A., Adam, H., & Kalenichenko, D. (2018). Quantization and training of neural networks for efficient integer-arithmetic-only inference. In 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 2704–2713).
2. Verhoef, B.-E., Laubeuf, N., Cosemans, S., Debacker, P., Papistas, I., Mallik, A., & Verkest, D. (2019). FQ-conv: Fully quantized convolution for efficient and accurate inference. arXiv preprint arXiv:1912.09356.
3. Chen, Z., Chen, Z., Lin, J., Liu, S., & Li, W. (2020). Deep neural network acceleration based on low-rank approximated channel pruning. IEEE Transactions on Circuits and Systems I: Regular Papers, 67(4), 1232–1244.
4. Choi, J., Kong, B. Y., & Park, I.-C. (2020). Retrain-less weight quantization for multiplier-less convolutional neural networks. IEEE Transactions on Circuits and Systems I: Regular Papers, 67(3), 972–982.
5. Courbariaux, M., Hubara, I., Soudry, D., El-Yaniv, R., & Bengio, Y. (2016). Binarized neural networks: Training deep neural networks with weights and activations constrained to +1 or −1. arXiv preprint arXiv:1602.02830.