DCSPose: A Dual-Channel Siamese Framework for Unseen Textureless Object Pose Estimation
Published: 2024-01-15
Issue: 2
Volume: 14
Page: 730
ISSN: 2076-3417
Container-title: Applied Sciences
Language: en
Short-container-title: Applied Sciences
Author:
Yue Zhen (1, 2), Han Zhenqi (1), Yang Xiulong (1), Liu Lizhuang (1)
Affiliation:
1. Shanghai Advanced Research Institute, Chinese Academy of Sciences, Shanghai 201210, China
2. University of Chinese Academy of Sciences, Beijing 100049, China
Abstract
Demand for object pose estimation is rising steadily, and deep learning has driven rapid progress in the field. However, most existing methods remain difficult to apply in industrial production, for three reasons. First, annotating 3D data is expensive, so models must generalize well beyond the objects they were trained on. Second, existing methods struggle with the textureless objects that are common in industrial settings. Third, industrial processes require real-time performance. To address these challenges, we introduce a dual-channel Siamese framework. The architecture uses a Siamese structure for template matching, learning to match templates built from high-fidelity simulated data against real-world scenes, which allows it to generalize to unseen objects. On top of this, two feature extraction channels process RGB and depth information separately, compensating for the limited features available on textureless objects. Our experiments show that the architecture effectively estimates the three-dimensional pose of objects, achieving a 6.0% to 10.9% improvement over state-of-the-art methods while exhibiting robust generalization and real-time processing capability.
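The idea described in the abstract, shared-weight (Siamese) encoders with separate RGB and depth channels used for template matching, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The class names, the single linear layer standing in for each backbone, the fusion by concatenation, and the cosine-similarity scoring are all assumptions made for illustration, and the weights are random rather than trained.

```python
import numpy as np


class Channel:
    """One feature-extraction channel (hypothetical stand-in for a CNN backbone)."""

    def __init__(self, in_dim, out_dim, rng):
        # Random, untrained weights; a real system would learn these.
        self.W = rng.standard_normal((in_dim, out_dim)) / np.sqrt(in_dim)

    def __call__(self, x):
        # Flatten each sample, apply a linear map, then ReLU.
        return np.maximum(x.reshape(x.shape[0], -1) @ self.W, 0.0)


class DualChannelSiamese:
    """Applies the same two channels to both scene crops and templates."""

    def __init__(self, rgb_dim, depth_dim, feat_dim=32, seed=0):
        rng = np.random.default_rng(seed)
        self.rgb = Channel(rgb_dim, feat_dim, rng)
        self.depth = Channel(depth_dim, feat_dim, rng)

    def embed(self, rgb, depth):
        # Assumed fusion: concatenate the two channel outputs, then L2-normalize.
        f = np.concatenate([self.rgb(rgb), self.depth(depth)], axis=1)
        return f / (np.linalg.norm(f, axis=1, keepdims=True) + 1e-8)

    def match(self, scene_rgb, scene_depth, tmpl_rgb, tmpl_depth):
        # Siamese step: the same weights embed the scene crop and every template.
        q = self.embed(scene_rgb[None], scene_depth[None])   # shape (1, 2F)
        t = self.embed(tmpl_rgb, tmpl_depth)                 # shape (N, 2F)
        scores = (t @ q.T).ravel()                           # cosine similarities
        return int(np.argmax(scores)), scores


# Toy data: 5 templates of 8x8 RGB and depth patches (random placeholders).
data_rng = np.random.default_rng(1)
tmpl_rgb = data_rng.random((5, 8, 8, 3))
tmpl_depth = data_rng.random((5, 8, 8))

model = DualChannelSiamese(rgb_dim=8 * 8 * 3, depth_dim=8 * 8)

# Use template 3 itself as the "scene" crop, so it should be its own best match.
best, scores = model.match(tmpl_rgb[3], tmpl_depth[3], tmpl_rgb, tmpl_depth)
print(best, scores.round(3))
```

In a template-matching pipeline of this kind, each template would typically carry the pose it was rendered from, so the index of the best match gives a pose estimate for the scene crop. That retrieval step is likewise an assumption here, not a detail taken from the abstract.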
Funder
Shanghai Science and Technology Innovation Project; Science and Technology Service Network Initiative, Chinese Academy of Sciences
Subject
Fluid Flow and Transfer Processes, Computer Science Applications, Process Chemistry and Technology, General Engineering, Instrumentation, General Materials Science