<scp>Virtuoso</scp> : Energy- and Latency-aware Streamlining of Streaming Videos on Systems-on-Chips-Reference-Cited by-同舟云学术

Virtuoso : Energy- and Latency-aware Streamlining of Streaming Videos on Systems-on-Chips

Published:2023-04-03 Issue:3 Volume:28 Page:1-32
ISSN:1084-4309
Container-title:ACM Transactions on Design Automation of Electronic Systems
language:en
Short-container-title:ACM Trans. Des. Autom. Electron. Syst.

Author:

Lee Jayoung¹^ORCID,Wang Pengcheng¹^ORCID,Xu Ran¹^ORCID,Jain Sarthak²^ORCID,Dasari Venkat³^ORCID,Weston Noah³^ORCID,Li Yin⁴^ORCID,Bagchi Saurabh¹^ORCID,Chaterji Somali¹^ORCID

Affiliation:

1. Purdue University, West Lafayette, Indiana

2. Los Gatos High School, CA

3. Army Research Lab, Adelphi, MD

4. University of Wisconsin at Madison, WI

Abstract

Efficient and adaptive computer vision systems have been proposed to make computer vision tasks, such as image classification and object detection, optimized for embedded or mobile devices. These solutions, quite recent in their origin, focus on optimizing the model (a deep neural network) or the system by designing an adaptive system with approximation knobs. Despite several recent efforts, we show that existing solutions suffer from two major drawbacks. First , while mobile devices or systems-on-chips usually come with limited resources including battery power, most systems do not consider the energy consumption of the models during inference. Second , they do not consider the interplay between the three metrics of interest in their configurations, namely, latency, accuracy, and energy. In this work, we propose an efficient and adaptive video object detection system— Virtuoso , which is jointly optimized for accuracy, energy efficiency, and latency. Underlying Virtuoso is a multi-branch execution kernel that is capable of running at different operating points in the accuracy-energy-latency axes, and a lightweight runtime scheduler to select the best fit execution branch to satisfy the user requirement. We position this work as a first step in understanding the suitability of various object detection kernels on embedded boards in the accuracy-latency-energy axes, opening the door for further development in solutions customized to embedded systems and for benchmarking such solutions. Virtuoso is able to achieve up to 286 FPS on the NVIDIA Jetson AGX Xavier board, which is up to 45× faster than the baseline EfficientDet D3 and 15× faster than the baseline EfficientDet D0. In addition, we also observe up to 97.2% energy reduction using Virtuoso compared to the baseline YOLO (v3)—a widely used object detector designed for mobiles. To fairly compare with Virtuoso , we benchmark 15 state-of-the-art or widely used protocols, including Faster R-CNN (FRCNN) [NeurIPS’15], YOLO v3 [CVPR’16], SSD [ECCV’16], EfficientDet [CVPR’20], SELSA [ICCV’19], MEGA [CVPR’20], REPP [IROS’20], FastAdapt [EMDL’21], and our in-house adaptive variants of FRCNN+, YOLO+, SSD+, and EfficientDet+ (our variants have enhanced efficiency for mobiles). With this comprehensive benchmark, Virtuoso has shown superiority to all the above protocols, leading the accuracy frontier at every efficiency level on NVIDIA Jetson mobile GPUs. Specifically, Virtuoso has achieved an accuracy of 63.9%, which is more than 10% higher than some of the popular object detection models, FRCNN at 51.1% and YOLO at 49.5%.

Funder

National Science Foundation

Amazon Research Award

Army Research Lab

Publisher

Association for Computing Machinery (ACM)

Subject

Electrical and Electronic Engineering,Computer Graphics and Computer-Aided Design,Computer Science Applications

Link

https://dl.acm.org/doi/pdf/10.1145/3564289

Reference54 articles.

1. Kittipat Apicharttrisorn, Xukan Ran, Jiasi Chen, Srikanth V. Krishnamurthy, and Amit K. Roy-Chowdhury. 2019. Frugal following: Power thrifty object detection and tracking for mobile augmented reality. In Proceedings of the 17th Conference on Embedded Networked Sensor Systems. 96–109.

2. A survey on 3D object detection methods for autonomous driving applications;Arnold Eduardo;IEEE Trans. Intell. Transport. Syst.,2019

3. Seung-Hwan Bae and Kuk-Jin Yoon. 2014. Robust online multi-object tracking based on tracklet confidence and online discriminative appearance learning. In Proceedings of the IEEE conference on Computer Vision and Pattern Recognition (CVPR’14). 1218–1225.

4. Mark Buckler, Suren Jayasuriya, and Adrian Sampson. 2017. Reconfiguring the imaging pipeline for computer vision. In Proceedings of the IEEE International Conference on Computer Vision. 975–984.

5. Bo Chen, Golnaz Ghiasi, Hanxiao Liu, Tsung-Yi Lin, Dmitry Kalenichenko, Hartwig Adam, and Quoc V. Le. 2020. MnasFPN: Learning latency-aware pyramid architecture for object detection on mobile devices. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR’20). 13607–13616.

Cited by 1 articles. 订阅此论文施引文献订阅此论文施引文献，注册后可以免费订阅5篇论文的施引文献，订阅后可以查看论文全部施引文献

1. Adaptive object detection algorithms for resource constrained autonomous robotic systems;Disruptive Technologies in Information Sciences VIII;2024-06-06