Abstract
Deep networks should be robust to rare events if they are to be successfully deployed in high-stakes real-world applications. Here we study the capability of deep networks to recognize objects in unusual poses. We create a synthetic dataset of images of objects in unusual orientations, and evaluate the robustness of a collection of 38 recent and competitive deep networks for image classification. We show that classifying these images is still a challenge for all networks tested, with an average accuracy drop of 29.5% compared to when the objects are presented
upright. This brittleness is largely unaffected by various design choices, such as training losses, architectures, dataset modalities, and data-augmentation schemes. However, networks trained on very large datasets substantially outperform others, with the best network tested—Noisy Student trained on JFT-300M—showing a relatively small accuracy drop of only 14.5% on unusual poses. Nevertheless, a visual inspection of the failures of Noisy Student reveals a remaining gap in robustness with humans. Furthermore, combining multiple object transformations—3D-rotations and scaling—further degrades the performance of all networks. Our results provide another measurement of the robustness of deep networks to consider when using them in the real world. Code and datasets are available at https://github.com/amro-kamal/ObjectPose.
Publisher
Association for the Advancement of Artificial Intelligence (AAAI)
Cited by
3 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献