Fully Cross-Attention Transformer for Guided Depth Super-Resolution

Published: 2023-03-02 Issue: 5 Volume: 23 Page: 2723
ISSN: 1424-8220
Container-title: Sensors
Language: en
Short-container-title: Sensors

Author:

Ariav Ido¹, Cohen Israel¹

Affiliation:

1. Andrew and Erna Viterbi Faculty of Electrical and Computer Engineering, Technion—Israel Institute of Technology, Haifa 3200003, Israel

Abstract

Modern depth sensors often have low spatial resolution, which hinders their use in real-world applications. In many scenarios, however, the depth map is accompanied by a corresponding high-resolution color image. For this reason, learning-based methods have been used extensively for guided super-resolution of depth maps. A guided super-resolution scheme uses the corresponding high-resolution color image to infer a high-resolution depth map from a low-resolution one. Unfortunately, these methods still suffer from texture copying caused by improper guidance from the color image. Specifically, most existing methods obtain guidance from the color image by naively concatenating color and depth features. In this paper, we propose a fully transformer-based network for depth map super-resolution. A cascaded transformer module extracts deep features from the low-resolution depth map. It incorporates a novel cross-attention mechanism that seamlessly and continuously feeds guidance from the color image into the depth upsampling process. A window partitioning scheme keeps the complexity linear in image resolution, so the network can be applied to high-resolution images. Extensive experiments show that the proposed method outperforms other state-of-the-art methods for guided depth super-resolution.
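The two ideas in the abstract that lend themselves to a short illustration are cross-attention between depth and color features and the window partitioning that keeps the cost linear in image resolution. The following is a minimal NumPy sketch of both, not the authors' implementation. It assumes a single attention head, queries taken from depth features with keys and values taken from color features, non-overlapping windows, and no positional bias or window shifting. All function and parameter names (`window_partition`, `windowed_cross_attention`, `attention_pairs`, `wq`, `wk`, `wv`) are hypothetical.

```python
import numpy as np


def window_partition(x, m):
    """Split an (H, W, C) feature map into non-overlapping m x m windows.

    Returns an array of shape (num_windows, m*m, C).
    """
    h, w, c = x.shape
    assert h % m == 0 and w % m == 0, "H and W must be multiples of m"
    x = x.reshape(h // m, m, w // m, m, c)
    return x.transpose(0, 2, 1, 3, 4).reshape(-1, m * m, c)


def window_merge(windows, m, h, w):
    """Inverse of window_partition: (num_windows, m*m, C) -> (H, W, C)."""
    c = windows.shape[-1]
    x = windows.reshape(h // m, w // m, m, m, c)
    return x.transpose(0, 2, 1, 3, 4).reshape(h, w, c)


def softmax(a, axis=-1):
    a = a - a.max(axis=axis, keepdims=True)  # subtract the max for stability
    e = np.exp(a)
    return e / e.sum(axis=axis, keepdims=True)


def windowed_cross_attention(depth_feat, color_feat, wq, wk, wv, m):
    """Single-head cross-attention restricted to m x m windows.

    Assumption for this sketch: queries come from the depth features,
    keys and values from the color (guidance) features.
    """
    h, w, _ = depth_feat.shape
    q = window_partition(depth_feat, m) @ wq  # (nW, m*m, d)
    k = window_partition(color_feat, m) @ wk  # (nW, m*m, d)
    v = window_partition(color_feat, m) @ wv  # (nW, m*m, d)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(q.shape[-1])  # (nW, m*m, m*m)
    out = softmax(scores) @ v
    return window_merge(out, m, h, w)


def attention_pairs(h, w, m=None):
    """Count query-key pairs: global attention if m is None, else windowed."""
    n = h * w
    return n * n if m is None else n * m * m


rng = np.random.default_rng(0)
H, W, C, M = 8, 8, 4, 4
depth = rng.standard_normal((H, W, C))
color = rng.standard_normal((H, W, C))
wq, wk, wv = (rng.standard_normal((C, C)) for _ in range(3))

out = windowed_cross_attention(depth, color, wq, wk, wv, M)
print(out.shape)  # (8, 8, 4)

# Global attention grows quadratically with the number of pixels,
# windowed attention grows linearly for a fixed window size.
print(attention_pairs(64, 64), attention_pairs(64, 64, m=8))  # 16777216 262144
```

With a fixed window size `m`, each of the `H*W / m**2` windows compares `m**2` queries against `m**2` keys, so the total number of query-key pairs is `H*W * m**2`. Doubling both image dimensions therefore multiplies the windowed cost by 4 and the global cost by 16.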

Funder

PMRI—Peter Munk Research Institute-Technion

Publisher

MDPI AG

Subject

Electrical and Electronic Engineering; Biochemistry; Instrumentation; Atomic and Molecular Physics, and Optics; Analytical Chemistry

Link

https://www.mdpi.com/1424-8220/23/5/2723/pdf

References: 63 articles.

1. Izadi, S., Kim, D., Hilliges, O., Molyneaux, D., Newcombe, R., Kohli, P., Shotton, J., Hodges, S., Freeman, D., and Davison, A. (2011, January 16–19). KinectFusion: Real-time 3D reconstruction and interaction using a moving depth camera. Proceedings of the 24th Annual ACM Symposium on User Interface Software and Technology, Santa Barbara, CA, USA.

2. Schamm, T., Strand, M., Gumpp, T., Kohlhaas, R., Zollner, J.M., and Dillmann, R. (2009, January 22–26). Vision and ToF-based driving assistance for a personal transporter. Proceedings of the 2009 International Conference on Advanced Robotics, Munich, Germany.

3. Guo (2018). Hierarchical features driven residual learning for depth map super-resolution. IEEE Trans. Image Process.

4. Hui, T.W., Loy, C.C., and Tang, X. (2016, January 11–14). Depth map super-resolution by deep multi-scale guidance. Proceedings of the European Conference on Computer Vision, Amsterdam, The Netherlands.

5. Riegler, G., Rüther, M., and Bischof, H. (2016, January 11–14). ATGV-Net: Accurate depth super-resolution. Proceedings of the European Conference on Computer Vision, Amsterdam, The Netherlands.

Cited by 2 articles.

1. Cascaded Degradation-Aware Blind Super-Resolution. Sensors, 2023-06-05.

2. PCB Defect Images Super-Resolution Reconstruction Based on Improved SRGAN. Applied Sciences, 2023-06-02.