Adopting Two Supervisors for Efficient Use of Large-Scale Remote Deep Neural Networks

Published:2023-11-23 Issue:1 Volume:33 Page:1-29
ISSN:1049-331X
Container-title:ACM Transactions on Software Engineering and Methodology
language:en
Short-container-title:ACM Trans. Softw. Eng. Methodol.

Author:

Weiss Michael¹, Tonella Paolo¹

Affiliation:

1. Università della Svizzera italiana, Switzerland

Abstract

Recent decades have seen the rise of large-scale Deep Neural Networks (DNNs) that achieve human-competitive performance in a variety of AI tasks. Often consisting of hundreds of millions, if not hundreds of billions, of parameters, these DNNs are too large to be deployed to or efficiently run on resource-constrained devices such as mobile phones or Internet of Things microcontrollers. Systems relying on large-scale DNNs thus have to call the corresponding model over the network, leading to substantial costs for hosting and running the large-scale remote model, costs which are often charged on a per-use basis. In this article, we propose BiSupervised, a novel architecture in which, before relying on a large remote DNN, a system attempts to make a prediction with a small-scale local model. A DNN supervisor monitors this prediction process and identifies easy inputs for which the local prediction can be trusted. For these inputs, the remote model does not have to be invoked, thus saving costs while only marginally impacting the overall system accuracy. Our architecture furthermore foresees a second supervisor that monitors the remote predictions and identifies inputs for which not even these can be trusted, allowing the system to raise an exception or run a fallback strategy instead. We evaluate the cost savings and the ability to detect incorrectly predicted inputs on four diverse case studies: IMDb movie review sentiment classification, GitHub issue triaging, ImageNet image classification, and SQuADv2 free-text question answering. In all four case studies, we find that BiSupervised reduces cost by at least 30% while maintaining similar system-level prediction performance. In two case studies (IMDb and SQuADv2), we find that BiSupervised even achieves a higher system-level accuracy, at reduced cost, compared to a remote-only model. Furthermore, measurements taken on our setup indicate a large potential of BiSupervised to reduce average prediction latency.
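
The control flow described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the supervisors decide whether to trust a prediction, so this sketch assumes each model returns a confidence score and that each supervisor is a simple threshold on that score. The names `bisupervised_predict`, `Outcome`, `local_threshold`, and `remote_threshold`, as well as the threshold values, are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

# A model maps an input to (prediction, confidence in [0, 1]).
# Using a single confidence score as the supervisor signal is an
# assumption of this sketch, not something stated in the abstract.
Model = Callable[[str], Tuple[str, float]]


@dataclass
class Outcome:
    prediction: Optional[str]  # None when the input is rejected
    source: str                # "local", "remote", or "rejected"


def bisupervised_predict(
    x: str,
    local_model: Model,
    remote_model: Model,
    local_threshold: float = 0.9,   # hypothetical value
    remote_threshold: float = 0.6,  # hypothetical value
) -> Outcome:
    # Step 1: try the small local model first.
    local_pred, local_conf = local_model(x)

    # First supervisor: trust the local prediction on "easy" inputs,
    # which avoids the cost of a remote call.
    if local_conf >= local_threshold:
        return Outcome(local_pred, "local")

    # Step 2: otherwise invoke the large remote model.
    remote_pred, remote_conf = remote_model(x)

    # Second supervisor: accept the remote prediction only if it
    # can be trusted.
    if remote_conf >= remote_threshold:
        return Outcome(remote_pred, "remote")

    # Neither prediction is trusted; the caller can raise an
    # exception or run a fallback strategy.
    return Outcome(None, "rejected")


# Toy stand-ins for the two models, for illustration only.
def toy_local(x: str) -> Tuple[str, float]:
    return ("positive", 0.95) if "great" in x else ("negative", 0.55)


def toy_remote(x: str) -> Tuple[str, float]:
    return ("negative", 0.80) if "boring" in x else ("positive", 0.40)


if __name__ == "__main__":
    for text in ["a great movie", "a boring movie", "an odd movie"]:
        print(text, "->", bisupervised_predict(text, toy_local, toy_remote))
```

In this sketch, raising `local_threshold` sends more inputs to the remote model (higher cost, fewer trusted local predictions), while raising `remote_threshold` rejects more remote predictions. How such thresholds would be chosen in practice is outside what the abstract specifies.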

Publisher

Association for Computing Machinery (ACM)

Subject

Software

Link

https://dl.acm.org/doi/pdf/10.1145/3617593
