Publisher
Springer Nature Singapore