Affiliation:
1. Institute of Artificial Intelligence, University of Science and Technology Beijing, Beijing 100083, China
2. Key Laboratory of Machine Perception, Shenzhen Graduate School, Peking University, Shenzhen 518055, China
Abstract
Self-supervised monocular depth estimation has made remarkable progress on outdoor scenes in recent years but often faces greater challenges in indoor scenes. These challenges comprise: (i) non-textured regions: indoor scenes often contain large non-textured areas, such as ceilings, walls, and floors, which render the widely adopted photometric loss ambiguous for self-supervised learning; (ii) camera pose: in outdoor scenes the sensor is mounted on a moving vehicle, whereas in indoor scenes it is handheld and moves freely, resulting in complex motions that complicate indoor depth estimation. In this paper, we propose a novel self-supervised indoor depth estimation framework, PMIndoor, that addresses these two challenges. We use multiple loss functions to constrain depth estimation in non-textured regions. To address the camera pose problem, we introduce a pose rectified network that estimates only the rotation transformation between two adjacent frames, and we improve the pose estimation results with a pose rectified network loss. We also incorporate a multi-head self-attention module into the depth estimation network to enhance the model’s accuracy. Extensive experiments on the benchmark indoor dataset NYU Depth V2 demonstrate that our method achieves excellent performance and outperforms previous state-of-the-art methods.
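The abstract mentions a multi-head self-attention module inside the depth estimation network. As a minimal NumPy sketch (not the paper's implementation, whose exact layout the abstract does not give), the operation over a flattened feature map looks like the following; the projection matrices here are random placeholders for parameters that would be learned in the depth encoder:

```python
import numpy as np

def multi_head_self_attention(x, num_heads, seed=0):
    """Toy multi-head self-attention over N positions with D-dim features.

    x: (N, D) array, e.g. a flattened H*W feature map.
    Random projections stand in for learned weights (illustration only).
    """
    n, d = x.shape
    assert d % num_heads == 0
    dh = d // num_heads
    rng = np.random.default_rng(seed)
    # Query/key/value/output projections (placeholders for learned params).
    Wq, Wk, Wv, Wo = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(4))
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    # Split channels into heads: (num_heads, N, dh).
    split = lambda t: t.reshape(n, num_heads, dh).transpose(1, 0, 2)
    q, k, v = split(q), split(k), split(v)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(dh)   # (H, N, N)
    scores -= scores.max(axis=-1, keepdims=True)      # numerical stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=-1, keepdims=True)          # softmax over keys
    out = attn @ v                                    # (H, N, dh)
    out = out.transpose(1, 0, 2).reshape(n, d)        # merge heads back
    return out @ Wo

# A 4x4 feature map with 8 channels, flattened to 16 positions.
feats = np.random.default_rng(1).standard_normal((16, 8))
out = multi_head_self_attention(feats, num_heads=2)
print(out.shape)  # (16, 8)
```

Because every position attends to every other, such a module lets the depth network propagate context across large texture-less regions, which is one plausible reason it helps indoors.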
Funder
National Natural Science Foundation of China
Shenzhen Fundamental Research Program
Science and Technology Plan of Shenzhen
Subject
Electrical and Electronic Engineering; Biochemistry; Instrumentation; Atomic and Molecular Physics, and Optics; Analytical Chemistry
Cited by
1 article.