Monocular Depth Estimation from a Fisheye Camera Based on Knowledge Distillation
Author:
Son Eunjin 1, Choi Jiho 1, Song Jimin 1, Jin Yongsik 2, Lee Sang Jun 1
Affiliation:
1. Division of Electronic Engineering, Jeonbuk National University, 567 Baekje-daero, Deokjin-gu, Jeonju 54896, Republic of Korea
2. IT Convergence Research Section, Electronics and Telecommunications Research Institute (ETRI), Daegu 42995, Republic of Korea
Abstract
Monocular depth estimation aims to predict pixel-level distances from a single RGB image. This task is significant in various applications, including autonomous driving and robotics. In particular, recognizing the surrounding environment is important for avoiding collisions during autonomous parking. Fisheye cameras are well suited to acquiring visual information from a wide field of view, reducing blind spots and preventing potential collisions. While demand for fisheye cameras in visual-recognition systems is increasing, existing research on depth estimation has primarily focused on pinhole camera images. Moreover, depth estimation from fisheye images poses additional challenges due to strong distortion and the lack of public datasets. In this work, we propose a novel underground parking lot dataset called JBNU-Depth360, which consists of 4221 pairs of fisheye images and their corresponding LiDAR point clouds, collected over six driving sequences. Furthermore, we employ a knowledge-distillation technique to improve the performance of state-of-the-art depth-estimation models. The teacher–student learning framework allows the neural network to leverage both dense depth predictions and sparse LiDAR projections. Experiments were conducted on the KITTI-360 and JBNU-Depth360 datasets to analyze the performance of existing depth-estimation models on fisheye camera images. With the self-distillation technique, the AbsRel and SILog error metrics were reduced by 1.81% and 1.55%, respectively, on the JBNU-Depth360 dataset. The experimental results demonstrate that self-distillation is beneficial for improving the performance of depth-estimation models.
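To make the teacher–student setup described in the abstract concrete, the following PyTorch sketch shows one plausible form of a self-distillation objective that combines sparse LiDAR supervision with dense teacher pseudo-labels, along with the two reported error metrics. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the L1 distance, the weighting factor alpha, and the SILog variance ratio lam are assumptions.

import torch
import torch.nn.functional as F

def self_distillation_loss(student_depth, teacher_depth, lidar_depth, alpha=0.5):
    """Combine sparse LiDAR supervision with dense teacher pseudo-labels.

    student_depth, teacher_depth: (B, 1, H, W) predicted depth maps.
    lidar_depth: (B, 1, H, W) projected LiDAR depth, 0 where no return.
    alpha: balancing weight (an assumed value, not taken from the paper).
    """
    valid = lidar_depth > 0  # pixels that received a LiDAR return
    # Supervised term: L1 error on the sparse set of LiDAR-projected pixels.
    sup_loss = F.l1_loss(student_depth[valid], lidar_depth[valid])
    # Distillation term: match the teacher's dense prediction at every pixel,
    # blocking gradients through the (frozen) teacher.
    distill_loss = F.l1_loss(student_depth, teacher_depth.detach())
    return sup_loss + alpha * distill_loss

def abs_rel(pred, gt):
    """Absolute relative error (AbsRel), evaluated on valid ground-truth pixels."""
    valid = gt > 0
    return torch.mean(torch.abs(pred[valid] - gt[valid]) / gt[valid])

def silog(pred, gt, lam=0.85):
    """Scale-invariant log error (SILog); lam = 0.85 is a common choice."""
    valid = gt > 0
    d = torch.log(pred[valid]) - torch.log(gt[valid])
    return torch.sqrt((d ** 2).mean() - lam * d.mean() ** 2) * 100

The key design point is that the sparse LiDAR term anchors the absolute scale on the few pixels with ground truth, while the dense distillation term propagates the teacher's predictions to the remaining pixels, which is what allows the student to benefit from both supervision sources.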
Funder
The Ministry of Trade, Industry & Energy; the Electronics and Telecommunications Research Institute
Subject
Electrical and Electronic Engineering; Biochemistry; Instrumentation; Atomic and Molecular Physics, and Optics; Analytical Chemistry