DCSPose: A Dual-Channel Siamese Framework for Unseen Textureless Object Pose Estimation

Published:2024-01-15 Issue:2 Volume:14 Page:730
ISSN:2076-3417
Container-title:Applied Sciences
language:en
Short-container-title:Applied Sciences

Author:

Yue Zhen¹,², Han Zhenqi¹, Yang Xiulong¹, Liu Lizhuang¹

Affiliation:

1. Shanghai Advanced Research Institute, Chinese Academy of Sciences, Shanghai 201210, China

2. University of Chinese Academy of Sciences, Beijing 100049, China

Abstract

Demand for object pose estimation is rising steadily, and deep learning has driven rapid progress in the field. Most existing research, however, is hard to apply in industrial production, for three reasons. First, annotating 3D data is costly, so neural network models must generalize well. Second, existing methods struggle with the textureless objects that are common in industrial settings. Third, industrial production processes demand real-time processing. This study introduces a dual-channel Siamese framework to address these challenges. The architecture uses a Siamese structure for template matching, learning to match templates built from high-fidelity simulated data against real-world scenes; this capacity lets it generalize to unseen objects. Building on this, two feature extraction channels process RGB and depth information separately, compensating for the limited features available on textureless objects. Experiments show that the architecture effectively estimates the three-dimensional pose of objects, achieving a 6.0% to 10.9% improvement over state-of-the-art methods while exhibiting robust generalization and real-time processing capability.
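
The abstract gives no implementation details, so the following is only a toy sketch of the general idea it describes: one encoder per modality (RGB and depth), each shared between the scene and the templates in Siamese fashion, with the best-matching template supplying the pose. Everything here is an assumption made for illustration: the names make_encoder, embed, and best_template are hypothetical, fixed random tanh projections stand in for learned feature extractors, inputs are short zero-centered descriptor vectors rather than images, and cosine similarity is an assumed matching score. It is not the authors' network.

```python
import math
import random


def make_encoder(in_dim, out_dim, seed):
    """Toy stand-in for a learned backbone: fixed random linear map + tanh."""
    rng = random.Random(seed)
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(in_dim)]
               for _ in range(out_dim)]

    def encode(vec):
        return [math.tanh(sum(w * x for w, x in zip(row, vec)))
                for row in weights]

    return encode


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def embed(rgb, depth, rgb_encoder, depth_encoder):
    """Dual channel: encode RGB and depth separately, then concatenate."""
    return rgb_encoder(rgb) + depth_encoder(depth)


def best_template(scene, templates, rgb_encoder, depth_encoder):
    """Siamese matching: the same encoders embed the scene and each template."""
    scene_emb = embed(scene["rgb"], scene["depth"], rgb_encoder, depth_encoder)
    scores = [
        cosine(scene_emb,
               embed(t["rgb"], t["depth"], rgb_encoder, depth_encoder))
        for t in templates
    ]
    index = max(range(len(scores)), key=scores.__getitem__)
    return templates[index]["pose"], scores[index]


if __name__ == "__main__":
    rgb_enc = make_encoder(in_dim=3, out_dim=8, seed=0)
    depth_enc = make_encoder(in_dim=2, out_dim=8, seed=1)
    # Hypothetical templates; in the paper these come from simulated renders.
    templates = [
        {"rgb": [-0.9, -0.1, -0.1], "depth": [-0.2, -0.8], "pose": "pose_a"},
        {"rgb": [0.9, 0.1, 0.1], "depth": [0.2, 0.8], "pose": "pose_b"},
    ]
    scene = {"rgb": [0.9, 0.1, 0.1], "depth": [0.2, 0.8]}
    print(best_template(scene, templates, rgb_enc, depth_enc))
```

Because the scene and the templates pass through the same encoders, adding a template for a new object requires no retraining in this sketch, which is the property the abstract ties to generalization to unseen objects.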

Funder

Shanghai Science and Technology Innovation Project

Science and Technology Service Network Initiative, Chinese Academy of Sciences

Publisher

MDPI AG

Subject

Fluid Flow and Transfer Processes, Computer Science Applications, Process Chemistry and Technology, General Engineering, Instrumentation, General Materials Science

Link

https://www.mdpi.com/2076-3417/14/2/730/pdf

References: 44 articles.
