1. Anderson, A., Vasudevan, A., Keane, C., and Gregg, D. (2020). High-performance low-memory lowering: GEMM-based algorithms for DNN convolution. In 2020 IEEE 32nd International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD), pages 99–106.
2. Barrachina, S., Castelló, A., Dolz, M. F., Low, T. M., Martínez, H., Quintana-Ortí, E. S., Sridhar, U., and Tomás, A. E. (2023). Reformulating the direct convolution for high-performance deep learning inference on ARM processors. Journal of Systems Architecture, 135:102806.
3. Chellapilla, K., Puri, S., and Simard, P. (2006). High Performance Convolutional Neural Networks for Document Processing. In Lorette, G., editor, Tenth International Workshop on Frontiers in Handwriting Recognition, La Baule (France). Université de Rennes 1, Suvisoft.
4. Chetlur, S., Woolley, C., Vandermersch, P., Cohen, J. M., Tran, J., Catanzaro, B., and Shelhamer, E. (2014). cuDNN: Efficient primitives for deep learning. arXiv, abs/1410.0759.
5. Cho, M. and Brand, D. (2017). MEC: Memory-efficient convolution for deep neural network. In Proceedings of the 34th International Conference on Machine Learning - Volume 70, ICML’17, pages 815–824. JMLR.org.