OHO: A Multi-Modal, Multi-Purpose Dataset for Human-Robot Object Hand-Over
Author:
Benedict Stephan 1, Mona Köhler 1, Steffen Müller 1, Yan Zhang 2, Horst-Michael Gross 1, Gunther Notni 2,3
Affiliation:
1. Neuroinformatics and Cognitive Robotics Lab, Technische Universität Ilmenau, 98693 Ilmenau, Germany
2. Group for Quality Assurance and Industrial Image Processing, Technische Universität Ilmenau, 98693 Ilmenau, Germany
3. Fraunhofer Institute for Applied Optics and Precision Engineering IOF Jena, 07745 Jena, Germany
Abstract
In the context of collaborative robotics, handing over hand-held objects to a robot is a safety-critical task. A robust distinction between human hands and presented objects in image data is therefore essential to avoid contact between the robotic gripper and the human hand. To enable the development of machine learning methods for solving this problem, we created the OHO (Object Hand-Over) dataset of tools and other everyday objects being held by human hands. Our dataset consists of color, depth, and thermal images and additionally provides pose and shape information about the objects in a real-world scenario. Although the focus of this paper is on instance segmentation, our dataset also enables training for other tasks such as 3D pose estimation or shape estimation of objects. For the instance segmentation task, we present a pipeline for automated label generation in point clouds as well as in image data. Through baseline experiments, we show that these labels are suitable for training an instance segmentation model to distinguish hands from objects on a per-pixel basis. Moreover, we present qualitative results for applying our trained model in a real-world application.
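The abstract does not prescribe a specific network, but the per-pixel hand/object separation it describes can be illustrated with a short sketch. The snippet below adapts a COCO-pretrained Mask R-CNN from torchvision to a hypothetical three-class layout (background, hand, held object); the class count, image size, and mask threshold are illustrative assumptions, not the authors' actual training setup.

```python
import torch
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor
from torchvision.models.detection.mask_rcnn import MaskRCNNPredictor

# Assumed class layout for this sketch: 0 = background, 1 = hand, 2 = held object.
NUM_CLASSES = 3

def build_model():
    # Start from a COCO-pretrained Mask R-CNN and swap the box and mask heads
    # so they predict the two foreground classes of interest.
    model = torchvision.models.detection.maskrcnn_resnet50_fpn(weights="DEFAULT")

    in_features = model.roi_heads.box_predictor.cls_score.in_features
    model.roi_heads.box_predictor = FastRCNNPredictor(in_features, NUM_CLASSES)

    in_channels_mask = model.roi_heads.mask_predictor.conv5_mask.in_channels
    model.roi_heads.mask_predictor = MaskRCNNPredictor(in_channels_mask, 256, NUM_CLASSES)
    return model

model = build_model().eval()

# Dummy RGB frame standing in for one OHO color image (float values in [0, 1]).
frame = torch.rand(3, 480, 640)
with torch.no_grad():
    prediction = model([frame])[0]

# prediction["masks"] holds one soft mask per detected instance; thresholding
# them yields the per-pixel hand/object separation described in the abstract.
instance_masks = prediction["masks"] > 0.5
```

How the dataset's depth and thermal channels would be fused with the RGB input is a separate design question and is not covered by this sketch.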
Funder
Free State of Thuringia within the European Social Fund; Carl Zeiss Foundation
Subject
Electrical and Electronic Engineering; Biochemistry; Instrumentation; Atomic and Molecular Physics, and Optics; Analytical Chemistry
Cited by
1 article.