Abstract
Detecting the objects surrounding a moving vehicle is essential for autonomous driving and for any kind of advanced driving assistance system; such a system can also be used for analyzing the surrounding traffic as the vehicle moves. The most popular techniques for object detection are based on image processing; in recent years, they have become increasingly focused on artificial intelligence. Systems using monocular vision are increasingly popular for driving assistance, as they do not require complex calibration and setup. The lack of three-dimensional data is compensated for by the efficient and accurate classification of the input image pixels. The detected objects are usually identified as cuboids in the 3D space, or as rectangles in the image space. Recently, instance segmentation techniques have been developed that are able to identify the freeform set of pixels that form an individual object, using complex convolutional neural networks (CNNs). This paper presents an alternative to these instance segmentation networks, combining much simpler semantic segmentation networks with light, geometrical post-processing techniques, to achieve instance segmentation results. The semantic segmentation network produces four semantic labels that identify the quarters of the individual objects: top left, top right, bottom left, and bottom right. These pixels are grouped into connected regions, based on their proximity and their position with respect to the whole object. Each quarter is used to generate a complete object hypothesis, which is then scored according to object pixel fitness. The individual homogeneous regions extracted from the labeled pixels are then assigned to the best-fitted rectangles, leading to complete and freeform identification of the pixels of individual objects. The accuracy is similar to instance segmentation-based methods but with reduced complexity in terms of trainable parameters, which leads to a reduced demand for computational resources.
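The geometric post-processing described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the label encoding (0 for background, 1 to 4 for the quarters), the rule that doubles a quarter's bounding box to form a full-object hypothesis, the fitness score, and the centroid-based assignment are all assumptions chosen for illustration, and the semantic segmentation network is omitted, so the quarter-label map is taken as given.

```python
from collections import deque

# Hypothetical label encoding (an assumption, not taken from the paper):
# 0 = background, 1 = top-left, 2 = top-right, 3 = bottom-left, 4 = bottom-right.
TL, TR, BL, BR = 1, 2, 3, 4

def connected_regions(grid):
    """Group equally labeled, 4-connected, non-background pixels into regions."""
    h, w = len(grid), len(grid[0])
    seen = [[False] * w for _ in range(h)]
    regions = []
    for y in range(h):
        for x in range(w):
            if grid[y][x] == 0 or seen[y][x]:
                continue
            label, pixels, queue = grid[y][x], [], deque([(x, y)])
            seen[y][x] = True
            while queue:
                cx, cy = queue.popleft()
                pixels.append((cx, cy))
                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    if (0 <= nx < w and 0 <= ny < h
                            and not seen[ny][nx] and grid[ny][nx] == label):
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            regions.append((label, pixels))
    return regions

def hypothesis(label, pixels):
    """Assumed rule: double the quarter's bounding box toward the object center."""
    xs = [p[0] for p in pixels]
    ys = [p[1] for p in pixels]
    x0, x1, y0, y1 = min(xs), max(xs), min(ys), max(ys)
    qw, qh = x1 - x0 + 1, y1 - y0 + 1
    if label in (TL, BL):          # left quarters grow to the right
        x1 = x0 + 2 * qw - 1
    else:                          # right quarters grow to the left
        x0 = x1 - 2 * qw + 1
    if label in (TL, TR):          # top quarters grow downward
        y1 = y0 + 2 * qh - 1
    else:                          # bottom quarters grow upward
        y0 = y1 - 2 * qh + 1
    return (x0, y0, x1, y1)

def expected_label(rect, x, y):
    """Quarter label that position (x, y) should carry inside rect."""
    x0, y0, x1, y1 = rect
    left = 2 * (x - x0) < (x1 - x0 + 1)
    top = 2 * (y - y0) < (y1 - y0 + 1)
    if top:
        return TL if left else TR
    return BL if left else BR

def score(grid, rect):
    """Assumed fitness: fraction of rectangle pixels carrying the expected label."""
    x0, y0, x1, y1 = rect
    h, w = len(grid), len(grid[0])
    hits = 0
    for y in range(y0, y1 + 1):
        for x in range(x0, x1 + 1):
            if 0 <= x < w and 0 <= y < h and grid[y][x] == expected_label(rect, x, y):
                hits += 1
    return hits / ((x1 - x0 + 1) * (y1 - y0 + 1))

def segment_instances(grid):
    """Map each surviving rectangle hypothesis to the pixels assigned to it."""
    regions = connected_regions(grid)
    rects = {hypothesis(label, pixels) for label, pixels in regions}
    ranked = sorted(rects, key=lambda r: (-score(grid, r), r))
    instances = {}
    for label, pixels in regions:
        cx = sum(p[0] for p in pixels) / len(pixels)
        cy = sum(p[1] for p in pixels) / len(pixels)
        for x0, y0, x1, y1 in ranked:
            rect = (x0, y0, x1, y1)
            # Take the best-scoring rectangle that contains the region's
            # centroid in the quarter matching the region's label.
            if (x0 <= cx <= x1 and y0 <= cy <= y1
                    and expected_label(rect, cx, cy) == label):
                instances.setdefault(rect, []).extend(pixels)
                break
    return instances

# Two touching objects: a 2x2 object next to a 4x2 object.
grid = [
    [1, 2, 1, 1, 2, 2],
    [3, 4, 3, 3, 4, 4],
]
for rect, pixels in sorted(segment_instances(grid).items()):
    print(rect, len(pixels))
```

In this toy input the two objects touch, so a plain foreground/background mask would merge them into one blob. Because the right quarters of the first object carry different labels from the left quarters of the second, the regions stay separate and each is attached to its own rectangle.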
Subject
Electrical and Electronic Engineering; Biochemistry; Instrumentation; Atomic and Molecular Physics, and Optics; Analytical Chemistry
Cited by
3 articles.