A Low-Cost Detail-Aware Neural Network Framework and Its Application in Mask Wearing Monitoring
Published: 2023-08-29
Issue: 17
Volume: 13
Page: 9747
ISSN: 2076-3417
Container-title: Applied Sciences
Language: en
Short-container-title: Applied Sciences
Author:
Cao Silei (1), Long Shun (1), Liao Fangting (1)
Affiliation:
1. College of Information Science and Technology, Jinan University, Guangzhou 510632, China
Abstract
The use of deep learning techniques in real-time monitoring can save considerable manpower in various scenarios. For example, mask-wearing is an effective measure to prevent COVID-19 and other respiratory diseases, especially for vulnerable populations such as children, the elderly, and people with underlying health problems. Currently, many public places experiencing outbreaks, such as hospitals, nursing homes, social service facilities, and schools, require mask-wearing. However, most of the terminal devices currently available have too little GPU capability to run large neural networks. This means that we have to keep the number of parameters of a neural network modest while maintaining its performance. In this paper, we propose a framework that applies deep learning techniques to real-time monitoring and use it for the real-time monitoring of mask-wearing status. The main contributions are as follows: First, a feature fusion technique called skip layer pooling fusion (SLPF) is proposed for image classification tasks. It fully utilizes both deep and shallow features of a convolutional neural network while minimizing the growth in model parameters caused by feature fusion. On average, this technique improves the accuracy of various neural network models by 4.78% and 5.21% on CIFAR100 and Tiny-ImageNet, respectively. Second, layer attention (LA), an attention mechanism tailor-made for feature fusion, is proposed. Since different layers of a convolutional neural network have different impacts on the final prediction, LA learns a set of weights that enhance the contribution of the features from important convolutional layers. On average, it improves the accuracy of various neural network models by 2.10% and 2.63% on CIFAR100 and Tiny-ImageNet, respectively. Third, a MobileNetv2-based lightweight mask-wearing status classification model is trained, which is suitable for deployment on mobile devices and achieves an accuracy of 95.49%. Additionally, a ResNet mask-wearing status classification model is trained, which has a larger model size but achieves a higher accuracy of 98.14%. Applying the proposed methods to the ResNet mask-wearing status classification model improves its accuracy by 1.58%. Fourth, a YOLOv5-based mask-wearing status detection model is enhanced with a spatial-frequency fusion module, resulting in a mAP improvement of 2.20%. Overall, this paper presents various techniques that improve the performance of neural networks and applies them to mask-wearing status monitoring, which can help stop pandemics.
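The abstract does not give the exact formulation of SLPF or LA, so the following is only a minimal sketch of the general idea it describes: features from several layers are pooled to vectors (pooling adds no parameters), each layer's vector is scaled by a learned per-layer weight, and the results are concatenated for the classifier. The function names, the use of global average pooling, and the softmax normalisation of the layer weights are all assumptions made for illustration, not the authors' implementation.

```python
import math

def global_avg_pool(feature_map):
    """Collapse a C x H x W feature map (nested lists) into a length-C vector.

    Pooling is parameter-free, which is one way to keep fusion cheap.
    """
    return [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def fuse_layers(feature_maps, layer_scores):
    """Hypothetical fusion: pool each layer, weight it, then concatenate.

    layer_scores stands in for the learnable per-layer weights
    (one scalar per fused layer); here they are passed in as plain floats.
    """
    weights = softmax(layer_scores)
    fused = []
    for feature_map, weight in zip(feature_maps, weights):
        fused.extend(weight * value for value in global_avg_pool(feature_map))
    return fused

# Toy input: a shallow layer (1 channel, 2x2) and a deep layer (2 channels, 1x1).
shallow = [[[1.0, 3.0], [5.0, 7.0]]]
deep = [[[2.0]], [[6.0]]]

# Equal scores give each layer a weight of 0.5.
fused = fuse_layers([shallow, deep], [0.0, 0.0])
```

In this sketch the only parameters added by fusion are one scalar per fused layer plus the wider input of the final classifier; in a real network the scores would be trained jointly with the rest of the model.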
Funder
Joint Research Fund in Astronomy, National Natural Science Foundation of China and the Chinese Academy of Sciences; Guangdong Basic and Applied Basic Research Foundation
Subject
Fluid Flow and Transfer Processes, Computer Science Applications, Process Chemistry and Technology, General Engineering, Instrumentation, General Materials Science