1. J. Albericio, A. Delmas, P. Judd, S. Sharify, G. O'Leary, R. Genov, and A. Moshovos, "Bit-pragmatic deep neural network computing," in Proceedings of the 50th Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 2017, Cambridge, MA, USA, October 14--18, 2017. ACM, 2017, pp. 382--394.
2. J. Albericio, P. Judd, T. Hetherington, T. Aamodt, N. E. Jerger, and A. Moshovos, "Cnvlutin: Ineffectual-neuron-free deep neural network computing," in 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA), 2016, pp. 1--13.
3. T. Chen et al., "DianNao: A small-footprint high-throughput accelerator for ubiquitous machine-learning," in Proceedings of the 19th International Conference on Architectural Support for Programming Languages and Operating Systems, ser. ASPLOS '14. New York, NY, USA: Association for Computing Machinery, 2014.
4. R. Cheong and R. Daniel, "transformers.zip: Compressing transformers with pruning and quantization," Technical report, Stanford University, 2019.
5. C. Deng, S. Liao, Y. Xie, K. K. Parhi, X. Qian, and B. Yuan, "PermDNN: Efficient compressed DNN architecture with permuted diagonal matrices," in Proceedings of the 51st Annual IEEE/ACM International Symposium on Microarchitecture, ser. MICRO-51. IEEE Press, 2018, pp. 189--202.