Publisher
Springer Nature Singapore
References (42 articles)
1. Botach, A., Zheltonozhskii, E., Baskin, C.: End-to-end referring video object segmentation with multimodal transformers. In: Proceedings of the IEEE/CVF CVPR (2022)
2. Caesar, H., Uijlings, J., Ferrari, V.: COCO-Stuff: thing and stuff classes in context. In: Proceedings of the IEEE/CVF CVPR, pp. 1209–1218 (2018)
3. Chen, Z., et al.: Vision transformer adapter for dense predictions. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=plKu2GByCNW
4. Cheng, B., Misra, I., Schwing, A.G., Kirillov, A., Girdhar, R.: Masked-attention mask transformer for universal image segmentation. In: Proceedings of the IEEE/CVF CVPR, pp. 1290–1299 (2022)
5. Cordts, M., et al.: The Cityscapes dataset for semantic urban scene understanding. In: Proceedings of the IEEE/CVF CVPR, pp. 3213–3223 (2016)