1. Auto-encoding scene graphs for image captioning;X Yang;Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR),2019
2. STAR-Transformer: A spatio-temporal cross attention transformer for human action recognition;D Ahn;Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV),2023
3. Learning to compose dynamic tree structures for visual contexts;K Tang;Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR),2019
4. Neural motifs: Scene graph parsing with global context;R Zellers;Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR),2018