1. Deep ViT features as dense visual descriptors;Amir;ECCV Workshop on What is Motion For?,2022
2. RT-1: Robotics Transformer for Real-World Control at Scale
3. Emerging Properties in Self-Supervised Vision Transformers
4. A simple framework for contrastive learning of visual representations;Chen
5. Vision transformer adapter for dense predictions;Chen