1. Show and tell: a neural image caption generator;Vinyals,2015
2. Deep visual-semantic alignments for generating image descriptions;Karpathy,2015
3. Deep captioning with multimodal recurrent neural networks (m-RNN);Mao,2015
4. Cross-modal retrieval with correspondence autoencoder;Feng,2014
5. A unified framework for multimodal retrieval;Rafailidis;Pattern Recognit.,2013