Progress and Limitations of Deep Networks to Recognize Objects in Unusual Poses

Published:2023-06-26 Issue:1 Volume:37 Page:160-168
ISSN:2374-3468
Container-title:Proceedings of the AAAI Conference on Artificial Intelligence
Short-container-title:AAAI

Author:

Abbas Amro, Deny Stéphane

Abstract

Deep networks should be robust to rare events if they are to be successfully deployed in high-stakes real-world applications. Here we study the capability of deep networks to recognize objects in unusual poses. We create a synthetic dataset of images of objects in unusual orientations, and evaluate the robustness of a collection of 38 recent and competitive deep networks for image classification. We show that classifying these images is still a challenge for all networks tested, with an average accuracy drop of 29.5% compared to when the objects are presented upright. This brittleness is largely unaffected by various design choices, such as training losses, architectures, dataset modalities, and data-augmentation schemes. However, networks trained on very large datasets substantially outperform others, with the best network tested—Noisy Student trained on JFT-300M—showing a relatively small accuracy drop of only 14.5% on unusual poses. Nevertheless, a visual inspection of the failures of Noisy Student reveals a remaining gap in robustness with humans. Furthermore, combining multiple object transformations—3D-rotations and scaling—further degrades the performance of all networks. Our results provide another measurement of the robustness of deep networks to consider when using them in the real world. Code and datasets are available at https://github.com/amro-kamal/ObjectPose.
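The accuracy drop quoted in the abstract can be illustrated with a minimal sketch. It assumes the drop is the difference, in percentage points, between top-1 accuracy on upright images and on the same objects in unusual poses; the abstract does not spell out the exact definition. The function names `accuracy` and `accuracy_drop` are hypothetical and are not taken from the authors' repository.

```python
from typing import Sequence


def accuracy(predictions: Sequence[int], labels: Sequence[int]) -> float:
    """Return the fraction of predictions that match the ground-truth labels."""
    if len(labels) == 0 or len(predictions) != len(labels):
        raise ValueError("predictions and labels must be non-empty and equally long")
    correct = sum(1 for p, y in zip(predictions, labels) if p == y)
    return correct / len(labels)


def accuracy_drop(
    upright_predictions: Sequence[int],
    posed_predictions: Sequence[int],
    labels: Sequence[int],
) -> float:
    """Accuracy drop in percentage points (assumed definition, see lead-in).

    Both prediction lists refer to the same objects, so they share one label list.
    """
    upright_acc = accuracy(upright_predictions, labels)
    posed_acc = accuracy(posed_predictions, labels)
    return 100.0 * (upright_acc - posed_acc)


if __name__ == "__main__":
    # Toy class indices, made up for illustration only.
    labels = [0, 1, 2, 3]
    upright = [0, 1, 2, 3]   # all correct when upright
    posed = [0, 1, 0, 0]     # half correct in unusual poses
    print(accuracy_drop(upright, posed, labels))
```

Averaging this quantity over a collection of networks would give a summary figure comparable in form to the one the abstract reports, under the stated assumption.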

Publisher

Association for the Advancement of Artificial Intelligence (AAAI)

Subject

General Medicine

Cited by 3 articles.

1. Exploring Prompting Approaches in Legal Textual Entailment;The Review of Socionetwork Strategies;2024-01-23

2. Are Deep Neural Networks Adequate Behavioral Models of Human Visual Perception?;Annual Review of Vision Science;2023-09-15

3. Medial temporal cortex supports compositional visual inferences;2023-09-08