Dual Projective Zero-Shot Learning Using Text Descriptions

Published:2023-01-05 Issue:1 Volume:19 Page:1-17
ISSN:1551-6857
Container-title:ACM Transactions on Multimedia Computing, Communications, and Applications
language:en
Short-container-title:ACM Trans. Multimedia Comput. Commun. Appl.

Author:

Rao Yunbo¹, Yang Ziqiang¹, Zeng Shaoning¹, Wang Qifeng², Pu Jiansu¹

Affiliation:

1. University of Electronic Science and Technology of China, Chengdu, Sichuan, China

2. Google Berkeley, Berkeley, California, USA

Abstract

Zero-shot learning (ZSL) aims to recognize image instances of unseen classes based solely on semantic descriptions of those classes. Within this field, Generalized Zero-Shot Learning (GZSL) is a challenging problem in which images of both seen and unseen classes are mixed at test time. Existing methods formulate GZSL as a semantic-visual correspondence problem and apply generative models such as Generative Adversarial Networks and Variational Autoencoders to solve it. However, these methods suffer from a bias problem: images of unseen classes are often misclassified into seen classes. In this work, a novel model named the Dual Projective model for Zero-Shot Learning (DPZSL) is proposed using text descriptions. To alleviate the bias problem, we leverage two autoencoders to project the visual and semantic features into a latent space and evaluate the embeddings with a visual-semantic correspondence loss function. An additional novel classifier is also introduced to ensure the discriminability of the embedded features. Our method focuses on the more challenging inductive ZSL setting, in which only labeled data from seen classes are used in the training phase. The experimental results, obtained on two popular datasets, Caltech-UCSD Birds-200-2011 (CUB) and North America Birds (NAB), show that the proposed DPZSL model significantly outperforms existing methods in both the inductive ZSL and GZSL settings. In particular, in the GZSL setting, our model yields an improvement of up to 15.2% over the state-of-the-art CANZSL on CUB and NAB with two splits.
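
The abstract names the ingredients of the model (two autoencoders, a shared latent space, a visual-semantic correspondence loss, and a classifier on the embedded features) but not their exact form. The following is a minimal NumPy sketch of how such an objective could be assembled, not the authors' DPZSL implementation. The single linear layers, the squared-error terms, the equal weighting of the three losses, the nearest-prototype prediction rule, and all names and dimensions are assumptions made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
N_CLASSES, VIS_DIM, SEM_DIM, LATENT_DIM = 5, 64, 32, 16


def init_linear(in_dim, out_dim):
    """Return (weights, bias) for one linear layer with small random weights."""
    return rng.normal(0.0, 0.1, size=(in_dim, out_dim)), np.zeros(out_dim)


def linear(x, params):
    """Apply a linear layer given as a (weights, bias) pair."""
    weights, bias = params
    return x @ weights + bias


# Two autoencoders (assumed linear): one for visual, one for text features.
vis_enc, vis_dec = init_linear(VIS_DIM, LATENT_DIM), init_linear(LATENT_DIM, VIS_DIM)
sem_enc, sem_dec = init_linear(SEM_DIM, LATENT_DIM), init_linear(LATENT_DIM, SEM_DIM)
# Classifier head on the shared latent space (assumed softmax over seen classes).
clf = init_linear(LATENT_DIM, N_CLASSES)


def cross_entropy(logits, labels):
    """Mean softmax cross-entropy, computed with a stable log-sum-exp."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()


def dual_projection_loss(visual, class_text, labels):
    """Return reconstruction, correspondence, and classification terms."""
    semantic = class_text[labels]        # text features of each image's class
    z_vis = linear(visual, vis_enc)      # visual features -> latent space
    z_sem = linear(semantic, sem_enc)    # semantic features -> latent space
    recon = (np.mean((linear(z_vis, vis_dec) - visual) ** 2)
             + np.mean((linear(z_sem, sem_dec) - semantic) ** 2))
    # Pull paired visual and semantic embeddings together (assumed L2 form).
    correspondence = np.mean(np.sum((z_vis - z_sem) ** 2, axis=1))
    classification = cross_entropy(linear(z_vis, clf), labels)
    return {"recon": recon, "correspondence": correspondence,
            "classification": classification,
            "total": recon + correspondence + classification}


def predict(visual, class_text):
    """Pick the class whose projected text features are nearest in latent space."""
    z_vis = linear(visual, vis_enc)
    z_cls = linear(class_text, sem_enc)
    dists = ((z_vis[:, None, :] - z_cls[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1)


# Random stand-in data; real inputs would be image and text features.
class_text = rng.normal(size=(N_CLASSES, SEM_DIM))
labels = rng.integers(0, N_CLASSES, size=8)
visual = rng.normal(size=(8, VIS_DIM))
losses = dual_projection_loss(visual, class_text, labels)
preds = predict(visual, class_text)
```

In this sketch, `predict` can be given text features for classes that were never seen during training, which is what makes a shared latent space usable for zero-shot recognition. Only the forward computation is shown; training would minimize `losses["total"]` with any gradient-based optimizer.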

Funder

Science and Technology Project of Sichuan

National Natural Science Foundation of China

Publisher

Association for Computing Machinery (ACM)

Subject

Computer Networks and Communications, Hardware and Architecture

Link

https://dl.acm.org/doi/pdf/10.1145/3514247

References: 52 articles.

1. Zeynep Akata, Mateusz Malinowski, Mario Fritz, and Bernt Schiele. 2016. Multi-cue zero-shot learning with strong supervision. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. IEEE, Las Vegas, Nevada, USA, 59–68.

2. Zeynep Akata et al. 2015. Label-embedding for image classification. IEEE Transactions on Pattern Analysis and Machine Intelligence.

3. Evaluation of output embeddings for fine-grained image classification

4. Yashas Annadani and Soma Biswas. 2018. Preserving semantic relations for zero-shot learning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. IEEE, Salt Lake City, Utah, USA, 7603–7612.

5. SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation

Cited by 7 articles.

1. Cross-domain zero-shot learning for enhanced fault diagnosis in high-voltage circuit breakers;Neural Networks;2024-12

2. A novel mechanical fault diagnosis for high-voltage circuit breakers with zero-shot learning;Expert Systems with Applications;2024-07

3. An intelligent compound fault diagnosis method using generalized zero-shot model of bearing;Measurement Science and Technology;2024-06-28

4. A comprehensive review on zero-shot-learning techniques;Intelligent Decision Technologies;2024-06-07

5. A comprehensive review on zero-shot-learning techniques;Intelligent Decision Technologies;2024-04-17