Authors:
Huang Hongtao, Tian Xiaofeng, Tian Wei
Abstract
Fisheye cameras, valued for their wide field of view, play a crucial role in perceiving a vehicle's surroundings. However, little research specifically addresses how to handle the strong distortion of fisheye images during segmentation. In addition, few fisheye datasets exist for autonomous driving, which can cause overfitting and hinder a model's generalization ability.
For the semantic segmentation task, a method for transforming normal images into fisheye images is proposed, which expands the fisheye image dataset. By employing a Transformer network and Across Feature Map Attention, segmentation performance is further improved, reaching 55.6% mIoU on WoodScape. Additionally, drawing on the concept of knowledge distillation, the network achieves strong generalization through dual-domain learning without compromising performance on WoodScape (54% mIoU).