Affiliation:
1. Computer Science and Engineering, Seoul National University, Seoul, Republic of Korea
Abstract
This tutorial paper addresses a low-power computer vision system as an example of a growing application domain of neural networks, exploring various technologies developed to enhance accuracy within the resource and performance constraints imposed by the hardware platform. For a given hardware platform and network model, software optimization techniques, including pruning, quantization, low-rank approximation, and parallelization, aim to satisfy resource and performance constraints while minimizing accuracy loss. Because these model compression approaches are interdependent, applying them systematically is crucial, as evidenced by the winning solutions in the Low-Power Image Recognition Challenge (LPIRC) of 2017 and 2018. Since contemporary hardware platforms typically contain heterogeneous processing elements, exploiting them effectively by parallelizing neural networks becomes increasingly vital for performance. The paper also advocates a more impactful strategy: designing a network architecture tailored to a specific hardware platform. For detailed information on each technique, the paper provides corresponding references.
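To make two of the compression steps named in the abstract concrete, the following minimal sketch (not taken from the paper; it uses NumPy and hypothetical helper names) illustrates magnitude-based weight pruning followed by uniform 8-bit post-training quantization of a weight matrix.

    # Minimal sketch, assuming a plain float32 weight matrix; not the paper's method.
    import numpy as np

    def prune_by_magnitude(weights: np.ndarray, sparsity: float) -> np.ndarray:
        """Zero out the smallest-magnitude weights until `sparsity` fraction is zero."""
        threshold = np.quantile(np.abs(weights), sparsity)
        return np.where(np.abs(weights) < threshold, 0.0, weights)

    def quantize_uint8(weights: np.ndarray):
        """Affine-quantize float weights to 8-bit integers with a scale and zero point."""
        w_min, w_max = float(weights.min()), float(weights.max())
        scale = (w_max - w_min) / 255.0 if w_max > w_min else 1.0
        zero_point = int(round(-w_min / scale))
        q = np.clip(np.round(weights / scale) + zero_point, 0, 255).astype(np.uint8)
        return q, scale, zero_point

    def dequantize(q: np.ndarray, scale: float, zero_point: int) -> np.ndarray:
        """Recover approximate float weights from the quantized representation."""
        return (q.astype(np.float32) - zero_point) * scale

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        w = rng.normal(size=(64, 64)).astype(np.float32)
        w_pruned = prune_by_magnitude(w, sparsity=0.5)   # about half the weights become zero
        q, scale, zp = quantize_uint8(w_pruned)          # store weights in 8 bits each
        err = np.abs(dequantize(q, scale, zp) - w_pruned).max()
        print(f"sparsity: {np.mean(w_pruned == 0):.2f}, max quantization error: {err:.4f}")

In practice the two steps interact (pruning changes the weight distribution that quantization must cover), which is why the paper stresses applying compression techniques systematically rather than in isolation.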
Publisher
Association for Computing Machinery (ACM)