1. Bampis, C. G., Li, Z., Moorthy, A. K., Katsavounidis, I., Aaron, A., & Bovik, A. C. (2017). Study of temporal effects on subjective video quality of experience. IEEE Transactions on Image Processing, 26(11), 5217–5231.
2. Barron, J. T. (2019). A general and adaptive robust loss function. In IEEE conference on computer vision and pattern recognition (CVPR), pp. 4331–4339.
3. Cho, K., Van Merriënboer, B., Gulcehre, C., Bahdanau, D., Bougares, F., Schwenk, H., & Bengio, Y. (2014). Learning phrase representations using RNN encoder-decoder for statistical machine translation. arXiv preprint arXiv:1406.1078.
4. Choi, L. K., & Bovik, A. C. (2018). Video quality assessment accounting for temporal visual masking of local flicker. Signal Processing: Image Communication, 67, 182–198.
5. Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K., & Fei-Fei, L. (2009). ImageNet: A large-scale hierarchical image database. In IEEE conference on computer vision and pattern recognition (CVPR), pp. 248–255.