Video Unsupervised Domain Adaptation with Deep Learning: A Comprehensive Survey

Author:

Xu Yuecong¹, Cao Haozhi², Xie Lihua³, Li Xiao-Li⁴, Chen Zhenghua⁴, Yang Jianfei⁵

Affiliation:

1. Department of Electrical and Computer Engineering, National University of Singapore, Singapore, Singapore

2. School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore, Singapore

3. School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore, Singapore

4. Institute for Infocomm Research, A*STAR, Singapore, Singapore

5. School of Mechanical and Aerospace Engineering, School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore, Singapore

Abstract

Video analysis tasks such as action recognition have received increasing research interest, with growing applications in fields such as smart healthcare, thanks to the introduction of large-scale datasets and deep learning-based representations. However, video models trained on existing datasets suffer significant performance degradation when deployed directly in real-world applications, owing to domain shifts between the public video datasets used for training (source video domains) and real-world videos (target video domains). Furthermore, given the high cost of video annotation, it is more practical to train with unlabeled videos. To jointly tackle performance degradation and the high cost of video annotation, video unsupervised domain adaptation (VUDA) is introduced to adapt video models from the labeled source domain to the unlabeled target domain by alleviating video domain shift, improving the generalizability and portability of video models. This paper surveys recent progress in VUDA with deep learning. We begin with the motivation for VUDA, followed by its definition, recent methods for both closed-set VUDA and VUDA under different scenarios, and current benchmark datasets for VUDA research. Finally, future directions are provided to promote further VUDA research. The repository of this survey is available at https://github.com/xuyu0010/awesome-video-domain-adaptation.

Publisher

Association for Computing Machinery (ACM)

References: 223 articles.

