Self-attention guided representation learning for image-text matching

Published:2021-08 Volume:450 Page:143-155
ISSN:0925-2312
Container-title:Neurocomputing
language:en
Short-container-title:Neurocomputing

Author:

Qi Xuefei, Zhang Ying, Qi Jinqing, Lu Huchuan

Funder

National Natural Science Foundation of China

Publisher

Elsevier BV

Subject

Artificial Intelligence, Cognitive Neuroscience, Computer Science Applications

References: 74 articles.

1. Stacked cross attention for image-text matching;Lee;ECCV,2018

2. VSE++: improving visual-semantic embeddings with hard negatives;Faghri;BMVC,2018

3. Linking image and text with 2-way nets;Eisenschtat;CVPR,2017

4. K. Xu, J. Ba, R. Kiros, K. Cho, A. C. Courville, R. Salakhutdinov, R. S. Zemel, Y. Bengio, Show, attend and tell: Neural image caption generation with visual attention, in: ICML, 2015, pp. 2048–2057.

5. O. Vinyals, A. Toshev, S. Bengio, D. Erhan, Show and tell: Lessons learned from the 2015 MSCOCO image captioning challenge, PAMI (2017) 652–663.

Cited by 16 articles.

1. An end-to-end image-text matching approach considering semantic uncertainty;Neurocomputing;2024-11

2. Unsupervised anomaly detection of nuclear power plants under noise background based on convolutional adversarial autoencoder combining self-attention mechanism;Nuclear Engineering and Design;2024-11

3. Self-supervised modal optimization transformer for image captioning;Neural Computing and Applications;2024-08-09

4. 3SHNet: Boosting image–sentence retrieval via visual semantic–spatial self-highlighting;Information Processing & Management;2024-07

5. Negative-Sensitive Framework With Semantic Enhancement for Composed Image Retrieval;IEEE Transactions on Multimedia;2024