PATCH

Published:2023-09-27 Issue:3 Volume:7 Page:1-24
ISSN:2474-9567
Container-title:Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies
language:en
Short-container-title:Proc. ACM Interact. Mob. Wearable Ubiquitous Technol.

Author:

Wang Juexing¹, Wang Guangjing¹, Zhang Xiao¹, Liu Li¹, Zeng Huacheng¹, Xiao Li¹, Cao Zhichao¹, Gu Lin², Li Tianxing¹

Affiliation:

1. Michigan State University, East Lansing, Michigan, USA

2. RIKEN AIP, Tokyo, Japan and The University of Tokyo, Tokyo, Japan

Abstract

Recent advancements in deep learning have shown that multimodal inference can be particularly useful in tasks like autonomous driving, human health, and production line monitoring. However, deploying state-of-the-art multimodal models in distributed IoT systems poses unique challenges, since the sensor data from low-cost edge devices can be corrupted, lost, or delayed before reaching the cloud. These problems are magnified by asymmetric data generation rates across sensor modalities, wireless network dynamics, and unpredictable sensor behavior. The result is either increased latency or degraded inference accuracy, which can disrupt the normal operation of the system with severe consequences such as human injury or car accidents. In this paper, we propose PATCH, a speculative inference framework that adapts to these complex scenarios. PATCH serves as a plug-in module for existing multimodal models and enables speculative inference with these off-the-shelf deep learning models. PATCH consists of 1) a Masked-AutoEncoder-based cross-modality imputation module that imputes missing data from partially available sensor data, 2) a lightweight feature pair ranking module that limits the search space for the optimal imputation configuration with low computation overhead, and 3) a data alignment module that aligns heterogeneous multimodal data streams without accurate timestamps or external synchronization mechanisms. We implement PATCH in nine popular multimodal models using five public datasets and one self-collected dataset. The experimental results show that PATCH achieves up to 13% mean accuracy improvement over the state-of-the-art method while using only 10% of the training data and reducing the training overhead by 73% compared to the original cost of retraining the model.
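As a rough illustration of what aligning two streams without timestamps can involve, the sketch below estimates the sample offset between two signals by maximizing their average product over a range of candidate shifts. This is a generic cross-correlation technique, not the PATCH alignment module, whose design the abstract does not describe. The function name `estimate_lag`, its parameters, and the toy signals are all hypothetical, and the signals are assumed to share a sampling rate and a roughly zero baseline.

```python
def estimate_lag(reference, delayed, max_lag):
    """Return the shift (in samples) that best aligns `delayed` with `reference`.

    A positive result means `delayed` lags behind `reference` by that many samples.
    Hypothetical helper for illustration only.
    """
    def score(lag):
        # Pair reference[i] with delayed[i + lag] wherever both indices are valid.
        pairs = [(reference[i], delayed[i + lag])
                 for i in range(len(reference))
                 if 0 <= i + lag < len(delayed)]
        if not pairs:
            return float("-inf")
        # Average product over the overlap, so short overlaps are not penalized.
        return sum(a * b for a, b in pairs) / len(pairs)

    # Try every candidate shift and keep the one with the highest score.
    return max(range(-max_lag, max_lag + 1), key=score)


# Toy example: the second stream is the first one delayed by two samples.
stream_a = [0, 0, 1, 3, 1, 0, 0, 0, 0, 0]
stream_b = [0, 0, 0, 0, 1, 3, 1, 0, 0, 0]
lag = estimate_lag(stream_a, stream_b, max_lag=4)
```

With the estimated lag in hand, one stream could be shifted before the modalities are fused. Real sensor streams would typically also need resampling and mean removal first.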

Publisher

Association for Computing Machinery (ACM)

Subject

Computer Networks and Communications, Hardware and Architecture, Human-Computer Interaction

Link

https://dl.acm.org/doi/pdf/10.1145/3610885

Reference: 97 articles.

1. John Aach and George M. Church. 2001. Aligning gene expression time series with time warping algorithms. Bioinformatics 17, 6 (06 2001), 495-508.

2. Davide Anguita, Alessandro Ghio, Luca Oneto, Xavier Parra Perez, and Jorge Luis Reyes Ortiz. 2013. A public domain dataset for human activity recognition using smartphones. In Proceedings of the 21st International European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning. 437-442.

3. Ho Bae, Jaehee Jang, Dahuin Jung, Hyemi Jang, Heonseok Ha, Hyungyu Lee, and Sungroh Yoon. 2018. Security and privacy issues in deep learning. arXiv preprint arXiv:1807.11655 (2018).

4. Pierre Baldi. 2012. Autoencoders, unsupervised learning, and deep architectures. In Proceedings of ICML workshop on unsupervised and transfer learning. JMLR Workshop and Conference Proceedings, 37-49.

5. Hritik Bansal, Nishad Singhi, Yu Yang, Fan Yin, Aditya Grover, and Kai-Wei Chang. 2023. CleanCLIP: Mitigating Data Poisoning Attacks in Multimodal Contrastive Learning. arXiv preprint arXiv:2303.03323 (2023).

Cited by 2 articles.

1. SoilCares: Towards Low-cost Soil Macronutrients and Moisture Monitoring Using RF-VNIR Sensing;Proceedings of the 22nd Annual International Conference on Mobile Systems, Applications and Services;2024-06-03

2. Low-latency MLLM Inference with Spatiotemporal Heterogeneous Distributed Multimodal Data;2024 IEEE Coupling of Sensing & Computing in AIoT Systems (CSCAIoT);2024-05-13