1. Armeni, I., Sax, S., Zamir, A.R., Savarese, S., 2017. Joint 2d-3d-semantic data for indoor scene understanding. arXiv preprint arXiv:1702.01105.
2. Bhoi, A., 2019. Monocular depth estimation: A survey. arXiv preprint arXiv:1901.09402.
3. Chang, A., Dai, A., Funkhouser, T., Halber, M., Nießner, M., Savva, M., Song, S., Zeng, A., Zhang, Y., 2018. Matterport3d: Learning from rgb-d data in indoor environments, in: 7th IEEE International Conference on 3D Vision (3DV), Institute of Electrical and Electronics Engineers Inc., pp. 667–676.
4. Chen, Y., Fan, H., Xu, B., Yan, Z., Kalantidis, Y., Rohrbach, M., Yan, S., Feng, J., 2019. Drop an octave: Reducing spatial redundancy in convolutional neural networks with octave convolution, in: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 3435–3444.
5. Eigen, D., Puhrsch, C., Fergus, R., 2014. Depth map prediction from a single image using a multi-scale deep network, in: Advances in Neural Information Processing Systems.