Affiliation:
1. School of Computer and Information, Hefei University of Technology, Hefei, 230601, P. R. China
Abstract
Monocular depth estimation aims to infer three-dimensional (3D) cues from a single RGB image. Although existing methods have achieved a certain degree of success, the impact of redundant information has rarely been studied. We propose to improve estimation accuracy by implicitly eliminating redundant information. To this end, we apply discrete representation to monocular depth estimation. By mapping continuous variables into a corresponding learned discrete latent space, a hierarchical multi-scale latent map is acquired as the decoder input. Removing redundant information can enhance prediction performance by helping the depth estimator balance local and global information. Furthermore, to take full advantage of the discrete representation, a lightweight fusion mechanism is introduced to aggregate information across multi-scale feature maps. Experiments on the NYU Depth V2 dataset demonstrate that our network is competitive with the state of the art.
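The mapping from continuous variables to a discrete latent space can be illustrated with a minimal sketch. The abstract does not state which quantization scheme is used, so the nearest-neighbour codebook lookup below is an assumption, and the names `quantize`, `codebook`, and `features` are hypothetical. In a trained network the codebook entries would be learned; here they are fixed toy values.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(
    features: List[Vector], codebook: List[Vector]
) -> Tuple[List[int], List[Vector]]:
    """Replace each continuous feature vector by its nearest codebook entry.

    Assumption: nearest-neighbour lookup stands in for the paper's
    (unspecified) discrete representation step.
    """
    indices = []
    for feature in features:
        # Index of the codebook entry closest to this feature vector.
        nearest = min(
            range(len(codebook)),
            key=lambda k: squared_distance(feature, codebook[k]),
        )
        indices.append(nearest)
    quantized = [codebook[i] for i in indices]
    return indices, quantized


# Toy codebook with three 2-D entries (hypothetical values).
codebook = [(0.0, 0.0), (1.0, 1.0), (4.0, 4.0)]
# Four continuous feature vectors, e.g. one per spatial location.
features = [(0.1, -0.2), (0.9, 1.2), (3.5, 4.1), (1.1, 0.8)]

indices, quantized = quantize(features, codebook)
print(indices)
print(quantized)
```

Features that differ only slightly collapse onto the same codebook entry, which is one way to picture how a discrete representation discards redundant variation before decoding.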
Publisher
World Scientific Pub Co Pte Ltd
Subject
Electrical and Electronic Engineering, Hardware and Architecture