SliceSamp: A Promising Downsampling Alternative for Retaining Information in a Neural Network
-
Published:2023-10-25
Issue:21
Volume:13
Page:11657
-
ISSN:2076-3417
-
Container-title:Applied Sciences
-
language:en
-
Short-container-title:Applied Sciences
Author:
He Lianlian1, Wang Ming2ORCID
Affiliation:
1. School of Mathematics and Statistics, Hubei University of Education, Wuhan 430205, China 2. School of Remote Sensing and Information Engineering, Wuhan University, Wuhan 430079, China
Abstract
Downsampling, which aims to improve computational efficiency by reducing the spatial resolution of feature maps, is a critical operation in neural networks. Many downsampling methods have been proposed to address the challenge of retaining feature map information. However, some detailed information is still lost, even though these methods can extract features with stronger semantics. In this paper, we propose a novel downsampling method which combines feature slicing and depthwise separable convolution for information-retaining downsampling. It slices the input feature map into multiple non-overlapping sub-feature maps by using indexes with a stride of two in the spatial dimension and applies depthwise separable convolution on each slice to extract feature information. To demonstrate the effectiveness of SliceSamp, we compare it with classical downsampling methods on image classification, object detection, and semantic segmentation tasks using several benchmark datasets, including ImageNet-1K, COCO, VOC, and ADE20K. Extensive experiments demonstrate that SliceSamp outperforms classical downsampling methods with consistent improvements in various computer vision tasks. The proposed SliceSamp shows advanced model performance with lower computational costs and memory requirements. By replacing the downsampling layers in different network architectures (including ResNet (Residual Network), YOLOv5, and Swin Transformer), SliceSamp brings different degrees of performance gains (+0.54~3.64%) compared to these baseline models. Additionally, SliceUpsamp enables high-resolution feature reconstruction and alignment during upsampling. SliceSamp and SliceUpsamp can be plug-and-play-integrated into existing neural network architectures. As a promising downsampling alternative to traditional methods, SliceSamp can also provide a reference for designing lightweight and high-performance model architectures in the future.
Subject
Fluid Flow and Transfer Processes,Computer Science Applications,Process Chemistry and Technology,General Engineering,Instrumentation,General Materials Science
Reference64 articles.
1. Deep learning;LeCun;Nature,2015 2. He, K., Zhang, X., Ren, S., and Sun, J. (2016, January 27–30). Deep residual learning for image recognition. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA. 3. He, K., Gkioxari, G., Dollar, P., and Girshick, R. (2017, January 22–29). Mask R-CNN. Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy. Available online: https://openaccess.thecvf.com/content_iccv_2017/html/He_Mask_R-CNN_ICCV_2017_paper.html. 4. Redmon, J., Divvala, S., Girshick, R., and Farhadi, A. (2016, January 27–30). You Only Look Once: Unified, Real-Time Object Detection. Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA. 5. Navab, N., Hornegger, J., Wells, W., and Frangi, A. (2015). Medical Image Computing and Computer-Assisted Intervention—MICCAI 2015, Springer International Publishing.
|
|