C-Reference: Improving 2D to 3D Object Pose Estimation Accuracy via Crowdsourced Joint Object Estimation

Published: 2020-05-28 Issue: CSCW1 Volume: 4 Page: 1-28
ISSN: 2573-0142
Container-title: Proceedings of the ACM on Human-Computer Interaction
Language: en
Short-container-title: Proc. ACM Hum.-Comput. Interact.

Author:

Song Jean Y.¹, Chung John Joon Young¹, Fouhey David F.¹, Lasecki Walter S.¹

Affiliation:

1. University of Michigan - Ann Arbor, Ann Arbor, MI, USA

Abstract

Converting widely-available 2D images and videos, captured using an RGB camera, to 3D can help accelerate the training of machine learning systems in spatial reasoning domains ranging from in-home assistive robots to augmented reality to autonomous vehicles. However, automating this task is challenging because it requires not only accurately estimating object location and orientation, but also knowing currently unknown camera properties (e.g., focal length). A scalable way to combat this problem is to leverage people's spatial understanding of scenes by crowdsourcing visual annotations of 3D object properties. Unfortunately, getting people to directly estimate 3D properties reliably is difficult due to the limitations of image resolution, human motor accuracy, and people's 3D perception (i.e., humans do not "see" depth like a laser range finder). In this paper, we propose a crowd-machine hybrid approach that jointly uses crowds' approximate measurements of multiple in-scene objects to estimate the 3D state of a single target object. Our approach can generate accurate estimates of the target object by combining heterogeneous knowledge from multiple contributors regarding various objects that share a spatial relationship with the target object. We evaluate our joint object estimation approach with 363 crowd workers and show that our method can reduce errors in the target object's 3D location estimation by over 40%, while requiring only 35% as much human time. Our work introduces a novel way to enable groups of people with different perspectives and knowledge to achieve more accurate collective performance on challenging visual annotation tasks.

Publisher

Association for Computing Machinery (ACM)

Subject

Computer Networks and Communications, Human-Computer Interaction, Social Sciences (miscellaneous)

Link

https://dl.acm.org/doi/pdf/10.1145/3392858

Reference: 72 articles.

1. Sean Bell, Paul Upchurch, Noah Snavely, and Kavita Bala. 2013. OpenSurfaces: A richly annotated catalog of surface appearance. ACM Transactions on Graphics (TOG) 32, 4 (2013), 111.

2. Xun Cao, Alan C. Bovik, Yao Wang, and Qionghai Dai. 2011. Converting 2D video to 3D: An efficient path to a 3D experience. IEEE MultiMedia 18, 4 (2011), 12-17.

3. Beat the MTurkers: Automatic Image Labeling from Weak 3D Supervision

4. Learning Single-Image Depth From Videos Using Quality Assessment Networks

Cited by 5 articles.

1. Ground-truth or DAER: Selective Re-query of Secondary Information;2021 IEEE/CVF International Conference on Computer Vision (ICCV);2021-10

2. GeoCAM: An IP-Based Geolocation Service Through Fine-Grained and Stable Webcam Landmarks;IEEE/ACM Transactions on Networking;2021-08

3. A Wide Area Multiview Static Crowd Estimation System Using UAV and 3D Training Simulator;Remote Sensing;2021-07-15

4. Crowdsourcing More Effective Initializations for Single-Target Trackers Through Automatic Re-querying;Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems;2021-05-06

5. Characterising Usage Patterns and Privacy Risks of a Home Security Camera Service;IEEE Transactions on Mobile Computing;2020