YFlows: Systematic Dataflow Exploration and Code Generation for Efficient Neural Network Inference using SIMD Architectures on CPUs


Zhou Cyrus¹, Hassman Zack¹, Shah Dhirpal¹, Richard Vaughn¹, Li Yanjing¹


1. University of Chicago, Chicago, USA










