1. Dennis Abts et al. 2020. Think Fast: A Tensor Streaming Processor (TSP) for Accelerating Deep Learning Workloads. In Proceedings of the 47th ACM/IEEE Annual International Symposium on Computer Architecture (ISCA). IEEE, 145--158.
2. Ravichandra Addanki, Shaileshh Bojja Venkatakrishnan, Shreyan Gupta, Hongzi Mao, and Mohammad Alizadeh. 2019. Learning Generalizable Device Placement Algorithms for Distributed Machine Learning. In Advances in Neural Information Processing Systems (NeurIPS), Hanna M. Wallach, Hugo Larochelle, Alina Beygelzimer, Florence d'Alché-Buc, Emily B. Fox, and Roman Garnett (Eds.). OpenReview.net, Vancouver, BC, Canada, 3983--3993.
3. Byung Hoon Ahn, Jinwon Lee, Jamie Menjay Lin, Hsin-Pai Cheng, Jilei Hou, and Hadi Esmaeilzadeh. 2020. Ordering Chaos: Memory-Aware Scheduling of Irregularly Wired Neural Networks for Edge Devices. In Proceedings of Machine Learning and Systems (MLSys), Inderjit S. Dhillon, Dimitris S. Papailiopoulos, and Vivienne Sze (Eds.). mlsys.org, Austin, TX, USA, 1--14.
4. Manoj Alwani, Han Chen, Michael Ferdman, and Peter Milder. 2016. Fused-Layer CNN Accelerators. In Proceedings of the 49th IEEE/ACM International Symposium on Microarchitecture (MICRO). IEEE Computer Society, Taipei, Taiwan, 1--12.
5. Arteris. 2022. Arteris IP Homepage. https://www.arteris.com.