1. Microsoft coco: Common objects in context;lin;European Conference on Computer Vision,2014
2. Tracked-Vehicle Retrieval by Natural Language Descriptions With Domain Adaptive Knowledge
3. Vilbert: Pretraining task-agnostic visiolinguistic representations for vision-and-language tasks;lu;Neural Information Processing Systems,2019
4. Decoupled weight decay regularization;loshchilov;International Conference on Learning Representations,2017
5. Momentum contrast for unsupervised visual representation learning;he;2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR),2019