Self-supervised recurrent depth estimation with attention mechanisms

Published: 2022-01-31 Volume: 8 Page: e865
ISSN:2376-5992
Container-title:PeerJ Computer Science
language:en

Author:

Makarov Ilya¹²³, Bakhanova Maria¹, Nikolenko Sergey⁴⁵, Gerasimova Olga¹

Affiliation:

1. HSE University, Moscow, Russia

2. Artificial Intelligence Research Institute (AIRI), Moscow, Russia

3. Big Data Research Center, National University of Science and Technology MISIS, Moscow, Russia

4. Steklov Institute of Mathematics at St. Petersburg, St. Petersburg, Russia

5. St. Petersburg State University, St. Petersburg, Russia

Abstract

Depth estimation has been an essential task for many computer vision applications, especially in autonomous driving, where safety is paramount. Depth can be estimated not only with traditional supervised learning but also via a self-supervised approach that relies on camera motion and does not require ground truth depth maps. Recently, major improvements have been introduced to make self-supervised depth prediction more precise. However, most existing approaches still focus on single-frame depth estimation, even in the self-supervised setting. Since most methods can operate with frame sequences, we believe that the quality of current models can be significantly improved with the help of information about previous frames. In this work, we study different ways of integrating recurrent blocks and attention mechanisms into a common self-supervised depth estimation pipeline. We propose a set of modifications that utilize temporal information from previous frames and provide new neural network architectures for monocular depth estimation in a self-supervised manner. Our experiments on the KITTI dataset show that proposed modifications can be an effective tool for exploiting temporal information in a depth prediction pipeline.

Funder

HSE University Basic Research Program

Publisher

PeerJ

Subject

General Computer Science

Link

https://peerj.com/articles/cs-865.pdf

References: 65 articles.

1. Delving deeper into convolutional networks for learning video representations;Ballas,2015

2. Estimating depth from monocular images as classification using deep fully convolutional residual networks;Cao,2016

3. Depth prediction without the sensors: leveraging structure for unsupervised learning from monocular videos;Casser,2018

4. Learning phrase representations using RNN encoder-decoder for statistical machine translation;Cho,2014

5. Predicting depth, surface normals and semantic labels with a common multi-scale convolutional architecture;Eigen,2014

Cited by 27 articles.

1. Polarimetric Imaging for Robot Perception: A Review;Sensors;2024-07-09

2. mini-Unet GAN: Optimized GAN for Monocular Depth Estimation;2024 6th International Conference on Pattern Analysis and Intelligent Systems (PAIS);2024-04-24

3. Inpainting Semantic and Depth Features to Improve Visual Place Recognition in the Wild;IEEE Access;2024

4. Attention U-Net Oriented Towards 3D Depth Estimation;Lecture Notes in Networks and Systems;2024

5. Application of Multimodal Machine Learning for Image Recommendation Systems;Communications in Computer and Information Science;2024