Affiliation:
1. College of Big Data and Software Engineering, Zhejiang Wanli University, Ningbo 315100, China
2. College of Computer Science and Technology, Zhejiang University, Hangzhou 310058, China
Abstract
Image semantic segmentation plays a crucial part in intelligent driving, medical image analysis, video surveillance, and AR. However, as scenes require more semantics to be inferred from video and audio clips and real-time requirements become stricter, neither the single-label classification method commonly used in the past nor regular manual labeling can meet these demands. Given the excellent performance of deep learning algorithms in a wide range of applications, image semantic segmentation algorithms based on deep learning frameworks have come into the spotlight. This paper attempts to improve ESPNet (Efficient Spatial Pyramid of Dilated Convolutions for Semantic Segmentation) based on the multilabel classification method in the following steps. First, the standard convolution in the convolution layer is replaced by applying the receptive field of deep convolutional neural networks, so that every pixel in the covered area contributes to the final feature response. Second, the ASPP (Atrous Spatial Pyramid Pooling) module is improved based on atrous convolution, and DB-ASPP (Delate Batch Normalization-ASPP) is proposed as a way to reduce the gridding artifacts caused by multilayer atrous convolution, acquire multiscale information, and integrate feature information related to the image set. Finally, the proposed model and conventional models are tested and compared extensively on multiple data sets. Results show that the proposed model achieves good segmentation accuracy, the smallest number of network parameters among the compared models at 0.3 M, and the fastest segmentation speed at 25 FPS.
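The gridding artifacts mentioned in the abstract can be illustrated with a small 1-D sketch. It is not the paper's DB-ASPP module. It only counts which input positions feed one output position when 3-tap atrous convolutions are stacked. The function names and the mixed dilation schedule (1, 2, 3) are assumptions chosen for illustration, not values taken from the paper.

```python
def effective_kernel(k, d):
    """Span covered by a k-tap kernel with dilation d: k + (k - 1) * (d - 1)."""
    return k + (k - 1) * (d - 1)


def receptive_field(layers):
    """Receptive field of stacked stride-1 1-D convolutions.

    `layers` is a list of (kernel_size, dilation) pairs.
    """
    rf = 1
    for k, d in layers:
        rf += effective_kernel(k, d) - 1
    return rf


def touched_offsets(layers):
    """Input offsets (relative to the output position) that actually
    contribute to a single output value after all layers."""
    offsets = {0}
    for k, d in layers:
        # Tap positions of a centred kernel, spaced by the dilation rate.
        taps = [(i - (k - 1) // 2) * d for i in range(k)]
        offsets = {o + t for o in offsets for t in taps}
    return offsets


# Same dilation in every layer: wide receptive field, but with holes.
uniform = [(3, 2), (3, 2), (3, 2)]
# Hypothetical mixed schedule (an assumption, not from the paper).
mixed = [(3, 1), (3, 2), (3, 3)]

for name, layers in (("uniform", uniform), ("mixed", mixed)):
    rf = receptive_field(layers)
    used = len(touched_offsets(layers))
    print(f"{name}: receptive field {rf}, positions used {used}")
```

With the uniform schedule only every second position inside the receptive field contributes to the output, which is the checkerboard-like sparsity usually called gridding. The mixed schedule covers the same span without holes in this toy setting.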
Funder
National Natural Science Foundation of China
Subject
General Engineering, General Mathematics