A Deep Joint Network for Monocular Depth Estimation Based on Pseudo-Depth Supervision

Author:

Tan Jiahai (1, 2), Gao Ming (1), Duan Tao (2), Gao Xiaomei (3)

Affiliation:

1. School of Optoelectronic Engineering, Xi’an Technological University, Xi’an 710021, China

2. State Key Laboratory of Transient Optics and Photonics, Xi’an Institute of Optics and Precision Mechanics, Chinese Academy of Sciences, Xi’an 710119, China

3. Xi’an Mapping and Printing of China National Administration of Coal Geology, Xi’an 710199, China

Abstract

Estimating depth from a single image is an important task. Deep learning methods hold great promise in this area, but they still face several challenges: limited modeling of nonlocal dependencies, the lack of effective models for jointly optimizing loss functions, and difficulty in accurately estimating object edges. To improve prediction accuracy, this research proposes a new network structure and training method for single-image depth estimation. A pseudo-depth network is first deployed to generate a depth prior from a single image; by constructing connecting paths between multi-scale local features with the proposed up-mapping and jumping modules, the network can integrate representations and recover fine details. A deep network is also designed to capture and convey global context, using the Transformer Conv module and Unet Depth net to extract and refine global features. Together, the two networks provide meaningful coarse and fine features for predicting high-quality depth images from single RGB images. In addition, multiple joint losses are used to improve training. A series of experiments confirms the efficacy of the method. On the NYU Depth V2 and KITTI depth estimation benchmarks, the proposed method outperforms the advanced method DPT by 10% and 3.3%, respectively, in root mean square error in log space (RMSE(log)), and by 1.7% and 1.6%, respectively, in squared relative difference (SRD).
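The abstract names two evaluation metrics without defining them. The sketch below assumes the definitions commonly used in monocular depth benchmarks: RMSE(log) as the root mean square of differences between log-depths, and squared relative difference as the mean of squared error divided by ground-truth depth. The function names and the plain-list inputs are illustrative choices, not taken from the paper, and the paper may mask invalid pixels or cap depth ranges in ways not shown here.

```python
import math


def rmse_log(pred, gt):
    """Root mean square error between log-depths (assumed standard definition).

    pred, gt: equal-length sequences of strictly positive depth values.
    """
    if len(pred) != len(gt) or len(pred) == 0:
        raise ValueError("pred and gt must be non-empty and of equal length")
    squared = [(math.log(p) - math.log(g)) ** 2 for p, g in zip(pred, gt)]
    return math.sqrt(sum(squared) / len(squared))


def squared_relative_difference(pred, gt):
    """Mean of (pred - gt)^2 / gt (assumed standard definition).

    pred, gt: equal-length sequences; gt values must be strictly positive.
    """
    if len(pred) != len(gt) or len(pred) == 0:
        raise ValueError("pred and gt must be non-empty and of equal length")
    terms = [((p - g) ** 2) / g for p, g in zip(pred, gt)]
    return sum(terms) / len(terms)


if __name__ == "__main__":
    # Hypothetical depth values in metres, for illustration only.
    predicted = [1.1, 2.0, 3.5]
    ground_truth = [1.0, 2.0, 4.0]
    print(rmse_log(predicted, ground_truth))
    print(squared_relative_difference(predicted, ground_truth))
```

Both metrics are errors, so lower values indicate better predictions; a perfect prediction yields zero for each.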

Funder

Open Research Fund of State Key Laboratory of Transient Optics and Photonics, Chinese Academy of Sciences

Key R&D project of Shaanxi Province

Key Scientific Research Program of Shaanxi Provincial Department of Education

Xi’an Science and Technology Research Plan

Publisher

MDPI AG

Subject

General Mathematics, Engineering (miscellaneous), Computer Science (miscellaneous)

