1. TALL: temporal activity localization via language query;Gao,2017
2. Localizing moments in video with natural language;Hendricks,2017
3. TVR: a large-scale dataset for video-subtitle moment retrieval;Lei,2020
4. Attention is all you need;Vaswani,2017
5. Self-attention generative adversarial networks;Zhang,2019