Authors:
Ji Zhong, Lin Zhigang, Wang Haoran, Pang Yanwei, Li Xuelong
Funders:
National Natural Science Foundation of China
Tianjin University