φunit: A Lightweight Module for Feature Fusion Based on Their Dimensions
Published: 2023-11-23
Issue: 23
Volume: 13
Page: 12621
ISSN: 2076-3417
Container-title: Applied Sciences
Language: en
Short-container-title: Applied Sciences
Author:
Long Zhengyu 1,2; Zhou Rigui 1,2; Li Yaochong 1,2; Ren Pengju 1,2; Yang Xue 1,2; Cai Shuo 1,2
Affiliation:
1. College of Information Engineering, Shanghai Maritime University, Shanghai 201306, China
2. Research Center of Intelligent Information Processing and Quantum Intelligent Computing, Shanghai 201306, China
Abstract
With the popularity of mobile devices, lightweight deep learning models are valuable in many application scenarios. However, effectively fusing feature information from different dimensions while keeping a model both lightweight and accurate remains an open problem. In this paper, we propose a novel feature fusion module, called φunit, which fuses the features extracted by networks of different dimensions in order of their feature information at a small computational cost, avoiding the information fragmentation caused by the simple feature stacking of traditional fusion methods. Based on φunit, we further build an extremely lightweight model, φNet, which achieves performance close to the best reported accuracy on several public datasets under a very limited parameter budget. The core idea of φunit is to use deconvolution to reduce the discrepancy among the features to be fused, and to lower the chance of information fragmentation after fusion by fusing features from different dimensions sequentially. φNet is a lightweight network composed of multiple φunits and bottleneck modules, with only 1.24 M parameters, far fewer than traditional lightweight models. In experiments on public datasets, φNet achieves an accuracy of 71.64% on the Food-101 dataset and 75.31% on a random 50-category subset of Food-101, in both cases higher than or close to the best reported accuracy. This work offers a new approach to feature fusion for lightweight models and an efficient model choice for deep learning applications on mobile devices.
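The abstract's core idea, upsampling coarser features via deconvolution-style operations so they can be fused with finer ones in sequence, can be illustrated with a minimal NumPy sketch. This is not the paper's actual φunit (which uses learned deconvolution kernels inside a full network); the helper names `upsample_transposed` and `sequential_fuse` are hypothetical, and the zero-insertion upsampling stands in for a stride-2 transposed convolution with its learned weights omitted.

```python
import numpy as np

def upsample_transposed(x, stride=2):
    """Zero-insertion upsampling: the spatial core of a stride-2 transposed
    convolution, with the learned kernel omitted for brevity."""
    c, h, w = x.shape
    out = np.zeros((c, h * stride, w * stride), dtype=x.dtype)
    out[:, ::stride, ::stride] = x  # scatter inputs onto the strided grid
    return out

def sequential_fuse(features):
    """Fuse feature maps in order from coarse to fine: upsample the running
    result to match the next feature's spatial size, then add it in.
    A toy stand-in for phi-unit-style ordered fusion."""
    feats = sorted(features, key=lambda f: f.shape[-1])  # coarse -> fine
    fused = feats[0]
    for f in feats[1:]:
        while fused.shape[-1] < f.shape[-1]:
            fused = upsample_transposed(fused)
        fused = fused + f  # element-wise fusion, not channel stacking
    return fused

# Two feature maps of different spatial dimensions, same channel count.
coarse = np.ones((8, 4, 4))
fine = np.ones((8, 8, 8))
out = sequential_fuse([coarse, fine])
print(out.shape)  # (8, 8, 8)
```

Fusing element-wise after aligning spatial sizes, rather than concatenating channels, is what keeps the parameter and memory cost small; the ordering (coarse first) mirrors the abstract's claim that fusing features sequentially by dimension reduces fragmentation.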
Funder
National Key R&D Plan
Subject
Fluid Flow and Transfer Processes, Computer Science Applications, Process Chemistry and Technology, General Engineering, Instrumentation, General Materials Science