Affiliation:
1. Department of Electronics Engineering, Sejong University, Seoul 05006, Republic of Korea
2. Department of Computer Engineering, Sejong University, Seoul 05006, Republic of Korea
Abstract
Vision transformers (ViTs) are a special type of transformer that can be used for various computer vision (CV) applications. ViTs can resolve several potential problems of convolutional neural networks (CNNs). Different ViT variants are used for image coding tasks such as compression, super-resolution, segmentation, and denoising. In our survey, we identified the many CV applications to which ViTs are applicable. The applications reviewed include image classification, object detection, image segmentation, image compression, image super-resolution, image denoising, anomaly detection, and drone imagery. We reviewed the state of the art, compiled a list of available models, and discussed the pros and cons of each model.
Subject
Artificial Intelligence, Computer Science Applications, Aerospace Engineering, Information Systems, Control and Systems Engineering
Cited by
21 articles.