3D layout estimation of general rooms based on ordinal semantic segmentation

Published:2023-01-27 Issue:8 Volume:17 Page:855-868
ISSN:1751-9632
Container-title:IET Computer Vision
language:en
Short-container-title:IET Computer Vision

Author:

Yao Hui¹, Miao Jun¹, Zhang Guoxiang², Chu Jun¹

Affiliation:

1. Institute of Computer Vision, Nanchang Hangkong University, Nanchang, Jiangxi, China

2. University of California, Merced, California, USA

Abstract

Room layout estimation aims to predict the location and extent of the layout planes of interior spaces. Previous works treat each layout plane as an independent entity without considering the ordinal relation between walls, resulting in missing wall planes and incomplete layouts. This paper proposes a novel two-branch neural network model to estimate 3D layouts of cuboid and non-cuboid room types. The model embeds the ordinal relation between layout planes into the layout segmentation branch through a proposed ordinal classification loss function, and outputs both pixel-level layout segmentation maps and layout plane parameter maps. Then, the instance-level plane parameters of each layout plane are determined using an instance-aware pooling layer. Finally, the sharpness of layout edges in the 2D layout semantic segmentation map is optimized using an improved depth map intersection algorithm. Furthermore, we annotate a large-scale 3D room layout estimation dataset, InteriorNet-Layout, to obtain a stable model. Experiments on synthesized and real-world datasets show that the proposed method achieves faster computation while maintaining high accuracy. Code is available at https://github.com/Hui-Yao/3D-ordinal-layout-estimation.
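
The instance-aware pooling step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes pooling is a plain average of per-pixel plane parameters over the pixels assigned to each layout plane, and the function name, array shapes, and three-component parameter encoding are hypothetical choices made for illustration.

```python
import numpy as np


def instance_aware_pooling(param_map, seg_map, num_classes):
    """Average per-pixel plane parameters over each segmented layout plane.

    param_map: (H, W, 3) per-pixel plane parameters (assumed encoding).
    seg_map:   (H, W) integer plane labels in [0, num_classes).
    Returns a (num_classes, 3) array; rows for labels absent from
    seg_map are filled with NaN.
    """
    flat_params = param_map.reshape(-1, 3)
    flat_labels = seg_map.reshape(-1)
    pooled = np.full((num_classes, 3), np.nan)
    for k in range(num_classes):
        mask = flat_labels == k  # pixels assigned to plane k
        if mask.any():
            pooled[k] = flat_params[mask].mean(axis=0)
    return pooled


# Toy 2x2 example: the top row belongs to plane 0, the bottom row to plane 1.
seg = np.array([[0, 0], [1, 1]])
params = np.array([[[1.0, 0.0, 0.0], [3.0, 0.0, 0.0]],
                   [[0.0, 1.0, 0.0], [0.0, 1.0, 0.0]]])
pooled = instance_aware_pooling(params, seg, num_classes=3)
```

Averaging within each segment gives one parameter vector per plane and reduces pixel-level noise; a network would typically use a differentiable, soft-mask version of the same idea.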

Funder

National Natural Science Foundation of China

Publisher

Institution of Engineering and Technology (IET)

Subject

Computer Vision and Pattern Recognition,Software

Link

https://onlinelibrary.wiley.com/doi/pdf/10.1049/cvi2.12149
