Author:
Chen Danya, Wu Lijun, Chen Zhicong, Lin Xufeng
Publisher:
Springer Nature Singapore
References (29 articles):
1. Carion, N., Massa, F., Synnaeve, G., et al.: End-to-end object detection with transformers. In: Vedaldi, A., Bischof, H. (eds.) Computer Vision – ECCV 2020. LNCS, vol. 12346, pp. 213–229. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58452-8_13
2. Chen, Y., Wang, Z., Peng, Y., et al.: Cascaded pyramid network for multi-person pose estimation. In: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 7103–7112 (2018)
3. Cheng, B., Xiao, B., Wang, J., et al.: HigherHRNet: scale-aware representation learning for bottom-up human pose estimation. In: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 5385–5394 (2020)
4. Dosovitskiy, A., Beyer, L., Kolesnikov, A., et al.: An image is worth 16×16 words: transformers for image recognition at scale. In: 9th International Conference on Learning Representations (ICLR), pp. 1–21 (2021)
5. Duan, H., Zhao, Y., Chen, K., et al.: Revisiting skeleton-based action recognition. In: 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2959–2968 (2022)