1. Frequency-tuned salient region detection
2. Hangbo Bao, Li Dong, Songhao Piao, and Furu Wei. 2022. BEiT: BERT Pre-Training of Image Transformers. In International Conference on Learning Representations. https://openreview.net/forum?id=p-BhZSz59o4
3. Zhe Chen, Yuchen Duan, Wenhai Wang, Junjun He, Tong Lu, Jifeng Dai, and Yu Qiao. 2023. Vision Transformer Adapter for Dense Predictions. In The Eleventh International Conference on Learning Representations. https://openreview.net/forum?id=plKu2GByCNW
4. Efficient Salient Region Detection with Soft Image Abstraction
5. Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, Jakob Uszkoreit, and Neil Houlsby. 2021. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale. In International Conference on Learning Representations. https://openreview.net/forum?id=YicbFdNTTy