1. An image is worth 16x16 words: Transformers for image recognition at scale;dosovitskiy;International Conference on Learning Representations,2021
2. Attention is all you need;vaswani;Advances in Neural Information Processing Systems,2017
3. BERT: Pre-training of deep bidirectional transformers for language understanding;devlin;Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers),2019
4. Coming Down to Earth: Satellite-to-Street View Synthesis for Geo-Localization
5. Deep Residual Learning for Image Recognition