TIAR: Text-Image-Audio Retrieval with weighted multimodal re-ranking

Published: 2023-07-04 Issue: 19 Volume: 53 Page: 22898–22916
ISSN: 0924-669X
Container-title: Applied Intelligence
Language: en
Short-container-title: Appl Intell

Author:

Chi Peide, Feng Yong, Zhou Mingliang, Xiong Xian-cai, Wang Yong-heng, Qiang Bao-hua

Funder

National Natural Science Foundation of China

Zhejiang Lab

Open Fund of Key Laboratory of Monitoring, Evaluation and Early Warning of Territorial Spatial Planning Implementation, Ministry of Natural Resources

Key Laboratory in Science and Technology Development Project of Suzhou

Guangxi Key Laboratory of Trusted Software

Publisher

Springer Science and Business Media LLC

Subject

Artificial Intelligence

Link

https://link.springer.com/content/pdf/10.1007/s10489-023-04669-3.pdf

References (68 articles)

1. Anderson P, He X, Buehler C, Teney D, Johnson M, Gould S, Zhang L (2018) Bottom-up and top-down attention for image captioning and visual question answering. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 6077–6086

2. Baevski A, Zhou Y, Mohamed A, Auli M (2020) wav2vec 2.0: A framework for self-supervised learning of speech representations. Advances in Neural Information Processing Systems 33:12449–12460

3. Brock A, De S, Smith SL, Simonyan K (2021) High-performance large-scale image recognition without normalization. In: International Conference on Machine Learning, PMLR, pp 1059–1071

4. Chen H, Ding G, Liu X, Lin Z, Liu J, Han J (2020a) IMRAM: Iterative matching with recurrent attention memory for cross-modal image-text retrieval. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 12655–12663

5. Chen L, Ren J, Chen P, Mao X, Zhao Q (2022) Limited text speech synthesis with electroglottograph based on Bi-LSTM and modified Tacotron-2. Applied Intelligence 52(13):15193–15209

Cited by 2 articles.

1. Semantic deep learning and adaptive clustering for handling multimodal multimedia information retrieval;Multimedia Tools and Applications;2024-05-25

2. Binding Text, Images, Graphs, and Audio for Music Representation Learning;Proceedings of the Cognitive Models and Artificial Intelligence Conference;2024-05-25