Deep Digging into the Generalization of Self-Supervised Monocular Depth Estimation

Published:2023-06-26 Issue:1 Volume:37 Page:187-196
ISSN:2374-3468
Container-title:Proceedings of the AAAI Conference on Artificial Intelligence
Short-container-title:AAAI

Author:

Bae Jinwoo,Moon Sungho,Im Sunghoon

Abstract

Self-supervised monocular depth estimation has been widely studied recently. Most of the work has focused on improving performance on benchmark datasets such as KITTI, but has offered few experiments on generalization performance. In this paper, we investigate how backbone networks (e.g., CNNs, Transformers, and CNN-Transformer hybrid models) affect the generalization of monocular depth estimation. We first evaluate state-of-the-art models on diverse public datasets that were never seen during network training. Next, we investigate the effects of texture-biased and shape-biased representations using various texture-shifted datasets that we generated. We observe that Transformers exhibit a strong shape bias, whereas CNNs exhibit a strong texture bias. We also find that shape-biased models generalize better for monocular depth estimation than texture-biased models. Based on these observations, we design a new CNN-Transformer hybrid network with a multi-level adaptive feature fusion module, called MonoFormer. The design intuition behind MonoFormer is to increase shape bias by employing Transformers while compensating for the weak locality bias of Transformers by adaptively fusing multi-level representations. Extensive experiments show that the proposed method achieves state-of-the-art performance on various public datasets. Our method also shows the best generalization ability among the competing methods.

Publisher

Association for the Advancement of Artificial Intelligence (AAAI)

Subject

General Medicine

Cited by 34 articles.

1. Multiple prior representation learning for self-supervised monocular depth estimation via hybrid transformer;Engineering Applications of Artificial Intelligence;2024-09

2. Self-supervised monocular depth estimation with self-distillation and dense skip connection;Computer Vision and Image Understanding;2024-09

3. RTIA-Mono: Real-Time Lightweight Self-Supervised Monocular Depth Estimation with Global-Local Information Aggregation;Digital Signal Processing;2024-09

4. RENA-Depth: toward recursion representation enhancement in neighborhood attention guided lightweight self-supervised monocular depth estimation;Optical Engineering;2024-08-22

5. Triple-Supervised Convolutional Transformer Aggregation for Robust Monocular Endoscopic Dense Depth Estimation;IEEE Transactions on Medical Robotics and Bionics;2024-08