1. Dong An, Yuankai Qi, Yan Huang, Qi Wu, Liang Wang, and Tieniu Tan. 2021. Neighbor-view Enhanced Model for Vision and Language Navigation. In ACM Int. Conf. Multimedia.
2. Dong An, Yuankai Qi, Yangguang Li, Yan Huang, Liang Wang, Tieniu Tan, and Jing Shao. 2022. BEVBert: Topo-Metric Map Pre-training for Language-guided Navigation. arXiv preprint arXiv:2212.04385 (2022).
3. Peter Anderson, Qi Wu, Damien Teney, Jake Bruce, Mark Johnson, Niko Sünderhauf, Ian D. Reid, Stephen Gould, and Anton van den Hengel. 2018. Vision-and-Language Navigation: Interpreting Visually-Grounded Navigation Instructions in Real Environments. In IEEE Conf. Comput. Vis. Pattern Recog. 3674–3683.
4. End-to-End Object Detection with Transformers
5. Matterport3D: Learning from RGB-D Data in Indoor Environments