Occlusion Boundary Prediction and Transformer Based Depth-Map Refinement From Single Image

Author:

Hambarde Praful¹, Wadhwa Gourav², Vipparthi Santosh Kumar¹, Murala Subrahmanyam³, Dhall Abhinav⁴

Affiliation:

1. Computer Vision and Pattern Recognition Lab, Indian Institute of Technology Ropar, India

2. ByteDance, Singapore

3. School of Computer Science and Statistics, Trinity College Dublin, Ireland

4. Flinders University Adelaide, Australia

Abstract

Boundary maps and occlusion orientation maps (ORI-maps) have numerous applications in high-level vision problems, so accurate estimation of these maps is a crucial task. Existing deep networks employ a single stream to model the relation between boundary map and ORI-map estimation. However, these networks fail to explore the significant information specific to each map separately. To resolve this problem, we propose a novel two-stream generative adversarial network (GAN) for boundary map and ORI-map estimation, named OBP-GAN. The proposed OBP-GAN consists of two streams, BP-GAN and OR-GAN: BP-GAN estimates the boundary map, and OR-GAN predicts the ORI-map. The boundary map and ORI-map can also be useful cues for refining depth maps estimated from single images. Therefore, we also propose a transformer-based depth-map refinement network (TRANSDMR-GAN) that refines the depth estimated from monocular images using the boundary map and ORI-map. We conducted extensive analyses on indoor and outdoor datasets to validate the proposed OBP-GAN and TRANSDMR-GAN. The experimental analysis and ablation study demonstrate the ability of OBP-GAN to generate state-of-the-art occlusion boundary maps. Furthermore, we show that TRANSDMR-GAN can generate an edge-enhanced depth map without degrading the accuracy of the initial depth map.

Publisher

Association for Computing Machinery (ACM)
