1. Akan, A.K., Erdem, E., Erdem, A., Guney, F., 2021. SLAMP: Stochastic Latent Appearance and Motion Prediction. In: Proceedings of the IEEE/CVF International Conference on Computer Vision.
2. Arnab, A., Dehghani, M., Heigold, G., Sun, C., Lučić, M., Schmid, C., 2021. ViViT: A video vision transformer. In: Proceedings of the IEEE/CVF International Conference on Computer Vision. pp. 6836–6846.
3. Babaeizadeh, M., Finn, C., Erhan, D., Campbell, R.H., Levine, S., 2018. Stochastic Variational Video Prediction. In: International Conference on Learning Representations.
4. Babaeizadeh, M., et al., 2021. FitVid: Overfitting in pixel-level video prediction.
5. Bei, X., Yang, Y., Soatto, S., 2021. Learning semantic-aware dynamics for video prediction. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition.