1. SlowFast network for continuous sign language recognition;Ahn,2023
2. Collaborative three-stream transformers for video captioning;Anon;Comput. Vis. Image Underst.,2023
3. Global–local contrastive multiview representation learning for skeleton-based action recognition;Anon;Comput. Vis. Image Underst.,2023
4. Cai, Y., Ge, L., Liu, J., Cai, J., Cham, T.-J., Yuan, J., Thalmann, N.M., 2019. Exploiting spatial-temporal relationships for 3D pose estimation via graph convolutional networks. In: Proc. IEEE Int. Conf. Comput. Vis., pp. 2272–2281.
5. HTNet: Human topology aware network for 3D human pose estimation;Cai,2023