An analysis of precision: occlusion and perspective geometry’s role in 6D pose estimation

Published:2023-10-31 Issue:3 Volume:36 Page:1261-1281
ISSN:0941-0643
Container-title:Neural Computing and Applications
language:en
Short-container-title:Neural Comput & Applic

Author:

Choate Jeffrey, Worth Derek, Nykl Scott, Taylor Clark, Borghetti Brett, Schubert Kabban Christine

Abstract

Achieving precise 6 degrees of freedom (6D) pose estimation of rigid objects from color images is a critical challenge with wide-ranging applications in robotics and close-contact aircraft operations. This study investigates key techniques in the application of YOLOv5 object detection convolutional neural network (CNN) for 6D pose localization of aircraft using only color imagery. Traditional object detection labeling methods suffer from inaccuracies due to perspective geometry and being limited to visible key points. This research demonstrates that with precise labeling, a CNN can predict object features with near-pixel accuracy, effectively learning the distinct appearance of the object due to perspective distortion with a pinhole camera. Additionally, we highlight the crucial role of knowledge about occluded features. Training the CNN with such knowledge slightly reduces pixel precision, but enables the prediction of 3 times more features, including those that are not initially visible, resulting in an overall better performing 6D system. Notably, we reveal that the data augmentation technique of scale can interfere with pixel precision when used during training. These findings are crucial for the entire system, which leverages the Solve Perspective-N-Point (Solve-PnP) algorithm, achieving 6D pose accuracy within 1° and 7 cm at distances ranging from 7.5 to 35 m from the camera. Moreover, this solution operates in real-time, achieving sub-10 ms processing times on a desktop PC.
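The pinhole camera model that the abstract refers to can be illustrated with a minimal sketch. It is not the authors' code: the intrinsics (`FX`, `FY`, `CX`, `CY`), the key point coordinates, and the helper name `project_pinhole` are all hypothetical values chosen for illustration. The sketch shows how 3D key points expressed in the camera frame map to pixel coordinates. The projection is purely geometric, so a pixel location is defined for a key point whether or not it is visible, and a PnP solver recovers the 6D pose by inverting this mapping from several 3D-to-2D correspondences.

```python
from typing import List, Tuple

Point3D = Tuple[float, float, float]  # (x, y, z) in metres, camera frame
Pixel = Tuple[float, float]           # (u, v) in pixels


def project_pinhole(points_cam: List[Point3D],
                    fx: float, fy: float,
                    cx: float, cy: float) -> List[Pixel]:
    """Project camera-frame 3D points to pixels with an ideal pinhole model.

    Lens distortion is ignored; this is an assumption of the sketch.
    """
    pixels = []
    for x, y, z in points_cam:
        if z <= 0:
            raise ValueError("point must lie in front of the camera (z > 0)")
        # Perspective division: image offset shrinks in proportion to depth.
        pixels.append((fx * x / z + cx, fy * y / z + cy))
    return pixels


# Hypothetical intrinsics for a 1280x720 image (not taken from the paper).
FX = FY = 1000.0
CX, CY = 640.0, 360.0

# Hypothetical key points: one on the optical axis, and the same 2 m lateral
# offset seen at 10 m and at 20 m from the camera.
keypoints = [(0.0, 0.0, 10.0), (2.0, 0.0, 10.0), (2.0, 0.0, 20.0)]
pixels = project_pinhole(keypoints, FX, FY, CX, CY)

for point, pixel in zip(keypoints, pixels):
    print(point, "->", pixel)
```

With these assumed values, doubling the depth halves the lateral offset in the image, which is one reason pixel-level label precision matters more as the object moves farther from the camera.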

Funder

Air Force Research Laboratory

Publisher

Springer Science and Business Media LLC

Subject

Artificial Intelligence,Software

Link

https://link.springer.com/content/pdf/10.1007/s00521-023-09094-8.pdf

References: 59 articles.
