1. CrossPoint: Self-Supervised Cross-Modal Contrastive Learning for 3D Point Cloud Understanding
2. Hangbo Bao, Li Dong, Songhao Piao, and Furu Wei. 2022. BEiT: BERT Pre-Training of Image Transformers. In International Conference on Learning Representations.
3. Tom Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared D Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, et al. 2020. Language models are few-shot learners. Advances in neural information processing systems Vol. 33 (2020), 1877--1901.
4. End-to-End Object Detection with Transformers
5. Angel X Chang, Thomas Funkhouser, Leonidas Guibas, Pat Hanrahan, Qixing Huang, Zimo Li, Silvio Savarese, Manolis Savva, Shuran Song, Hao Su, et al. 2015. Shapenet: An information-rich 3d model repository. arXiv preprint arXiv:1512.03012 (2015).