1. Abbasi, A., Kalkan, S., Sahillioğlu, Y.: Deep 3D semantic scene extrapolation. Vis. Comput. 35, 271–279 (2019)
2. Anderson, P., Wu, Q., Teney, D., Bruce, J., Johnson, M., Sünderhauf, N., Reid, I., Gould, S., van den Hengel, A.: Vision-and-language navigation: interpreting visually-grounded navigation instructions in real environments (2017). arXiv:1711.07280
3. Anderson, P., Chang, A., Chaplot, D., Dosovitskiy, A., Gupta, S., Koltun, V., Kosecka, J., Malik, J., Mottaghi, R., Savva, M., Zamir, A.: On evaluation of embodied navigation agents (2018). arXiv:1807.06757
4. Arad Hudson, D., Zitnick, L.: Compositional transformers for scene generation. Adv. Neural Inf. Process. Syst. 34, 9506–9520 (2021)
5. Argaw, D.M., Kim, J., Rameau, F., Kweon, I.S.: Motion-blurred video interpolation and extrapolation (2021). arXiv:2103.02984