1. Masked Siamese Networks for Label-Efficient Learning
2. SiT: Self-supervised vision transformer;Atito,2021
3. MultiMAE: Multi-modal multi-task masked autoencoders. In Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022;Bachmann,2022
4. data2vec: A general framework for self-supervised learning in speech, vision and language;Baevski,2022
5. TransFusion: Robust LiDAR-Camera Fusion for 3D Object Detection with Transformers