1. Zero-Shot Object Detection: Joint Recognition and Localization of Novel Concepts
2. Improved Visual-Semantic Alignment for Zero-Shot Object Detection
3. ViL-BERT: Pretraining task-agnostic visiolinguistic representations for vision-and-language tasks;lu;Advances in Neural Information Processing Systems (NeurIPS),2019
4. Microsoft COCO: Common objects in context;lin;ECCV,2014
5. Oscar: Object-semantics aligned pre-training for vision-language tasks;li;ECCV,2020