1. Show, attend and tell: neural image caption generation with visual attention;Xu,2015
2. Gla: global-local attention for image description;Li;IEEE Trans. Multimedia,2017
3. Hierarchical attention network for image captioning;Wang,2019
4. Seqgan: Sequence generative adversarial nets with policy gradient;Yu,2017
5. Policy gradient methods for reinforcement learning with function approximation;Sutton,1999