The Curious Layperson: Fine-Grained Image Recognition Without Expert Labels

Author:

Choudhury Subhabrata, Laina Iro, Rupprecht Christian, Vedaldi Andrea

Abstract

Most of us are not experts in specific fields, such as ornithology. Nonetheless, we do have general image and language understanding capabilities that we use to match what we see to expert resources. This allows us to expand our knowledge and perform novel tasks without ad-hoc external supervision. Machines, by contrast, have a much harder time consulting expert-curated knowledge bases unless trained specifically with that knowledge in mind. Thus, in this paper we consider a new problem: fine-grained image recognition without expert annotations, which we address by leveraging the vast knowledge available in web encyclopedias. First, we learn a model to describe the visual appearance of objects using non-expert image descriptions. We then train a fine-grained textual similarity model that matches image descriptions with documents on a sentence-level basis. We evaluate the method on two datasets (CUB-200 and Oxford-102 Flowers) and compare with several strong baselines and the state of the art in cross-modal retrieval. Code is available at: https://github.com/subhc/clever.
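The sentence-level matching step described in the abstract can be illustrated with a toy sketch. This is not the authors' method: the paper trains a textual similarity model, whereas the sketch below substitutes a plain bag-of-words cosine similarity so that it runs with the Python standard library alone. All function names and the two example documents are hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter


def bag_of_words(sentence):
    """Lowercased word counts; a crude stand-in for a learned sentence encoder."""
    return Counter(re.findall(r"[a-z]+", sentence.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def split_sentences(text):
    """Naive sentence splitter on ., ! and ? (an assumption, not the paper's tokenizer)."""
    return [s.strip() for s in re.split(r"[.!?]+", text) if s.strip()]


def score_document(description, document):
    """Score a document by its best-matching sentence for the given description."""
    desc = bag_of_words(description)
    sentence_scores = (cosine(desc, bag_of_words(s)) for s in split_sentences(document))
    return max(sentence_scores, default=0.0)


def rank_documents(description, documents):
    """Return document names sorted from best to worst match."""
    return sorted(
        documents,
        key=lambda name: score_document(description, documents[name]),
        reverse=True,
    )


# Hypothetical toy data, not drawn from CUB-200 or any encyclopedia.
documents = {
    "cardinal": "The cardinal is a songbird. The male has a red body and a black face mask.",
    "blue jay": "The blue jay is a noisy bird. It has a blue crest and white underparts.",
}
description = "a bird with a red body and a black face"
ranking = rank_documents(description, documents)
print(ranking)
```

Scoring each document by its best-matching sentence, rather than by the whole text, reflects the idea that only a few sentences of an encyclopedia article typically describe visual appearance. How the actual model aggregates sentence scores is not stated in the abstract, so taking the maximum here is an assumption.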

Funder

Facebook

European Research Council

Engineering and Physical Sciences Research Council

Publisher

Springer Science and Business Media LLC

Subject

Artificial Intelligence, Computer Vision and Pattern Recognition, Software


Cited by 3 articles.

1. Pattern-Expandable Image Copy Detection;International Journal of Computer Vision;2024-06-22

2. Tomato ripeness detection based on image recognition;International Conference on Image, Signal Processing, and Pattern Recognition (ISPP 2024);2024-06-13

3. Waffling around for Performance: Visual Classification with Random Words and Broad Concepts;2023 IEEE/CVF International Conference on Computer Vision (ICCV);2023-10-01
