1. SPICE: Semantic propositional image caption evaluation;Anderson,2016
2. Bottom-up and top-down attention for image captioning and visual question answering;Anderson,2018
3. METEOR: An automatic metric for MT evaluation with improved correlation with human judgments;Banerjee,2005
4. Regularizing RNNs for caption generation by reconstructing the past with the present;Chen,2018
5. SCA-CNN: Spatial and channel-wise attention in convolutional networks for image captioning;Chen,2017