1. End-to-End Object Detection with Transformers
2. Localizing Visual Sounds the Hard Way
3. Liang-Chieh Chen, George Papandreou, Iasonas Kokkinos, Kevin Murphy, and Alan L. Yuille. 2017. DeepLab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs. IEEE Transactions on Pattern Analysis and Machine Intelligence 40, 4 (2017), 834--848.
4. Transformer Tracking
5. Bowen Cheng, Alexander G. Schwing, and Alexander Kirillov. 2021. Per-Pixel Classification is Not All You Need for Semantic Segmentation. In Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, NeurIPS 2021, December 6-14, 2021, virtual, Marc'Aurelio Ranzato, Alina Beygelzimer, Yann N. Dauphin, Percy Liang, and Jennifer Wortman Vaughan (Eds.). 17864--17875. https://proceedings.neurips.cc/paper/2021/hash/950a4152c2b4aa3ad78bdd6b366cc179-Abstract.html