Affiliation:
1. College of Science and Technology, Ningbo University, Ningbo 315300, China
Abstract
In practical processing systems, video inevitably suffers from distortion, which degrades quality and harms the user experience; an accurate and effective objective video quality assessment (VQA) method is therefore of great importance. In this paper, considering the multi-dimensional characteristics of video and the visual perceptual mechanism, a two-stream convolutional network for VQA based on spatial–temporal analysis, named TSCNN-VQA, is proposed. Specifically, TSCNN-VQA first extracts spatial and temporal features with two separate convolutional neural network branches. A spatial–temporal fusion stage then combines the two streams into joint spatial–temporal features. TSCNN-VQA also integrates an attention module so that this process conforms to the way the human visual system perceives video information. Finally, the overall quality score is obtained by non-linear regression. Experimental results on both the LIVE and CSIQ VQA datasets show that TSCNN-VQA achieves higher performance indicators than existing VQA methods, demonstrating that it can accurately evaluate video quality and is more consistent with the human visual system.
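The pipeline the abstract describes — two branch-wise feature extractors, spatial–temporal fusion, attention weighting, and a non-linear regression head — can be illustrated with a toy sketch. This is not the paper's TSCNN-VQA implementation: the real branches are trained CNNs, whereas here each stage is replaced by a simple hypothetical stand-in (pooled frame statistics for the spatial branch, frame-difference statistics for the temporal branch, a softmax attention over frames, and a sigmoid regression) purely to show how the pieces connect.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy video: 8 frames of 16x16 grayscale.
video = rng.random((8, 16, 16))

def spatial_features(frames):
    # Stand-in for the spatial CNN branch: per-frame pooled statistics.
    return np.stack([[f.mean(), f.std()] for f in frames])      # (T, 2)

def temporal_features(frames):
    # Stand-in for the temporal CNN branch: statistics of absolute
    # frame differences, padded to match the frame count.
    diffs = np.abs(np.diff(frames, axis=0))
    feats = np.stack([[d.mean(), d.std()] for d in diffs])      # (T-1, 2)
    return np.vstack([feats, feats[-1]])                        # (T, 2)

def attention_fuse(spatial, temporal):
    # Joint fusion: concatenate the two streams per frame, then weight
    # frames with a softmax attention over their feature energy.
    joint = np.concatenate([spatial, temporal], axis=1)         # (T, 4)
    scores = joint.sum(axis=1)
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                                    # attention
    return weights @ joint                                      # (4,)

def quality_score(feat, w, b):
    # Non-linear regression head mapping fused features to one score.
    return float(1.0 / (1.0 + np.exp(-(feat @ w + b))))

feat = attention_fuse(spatial_features(video), temporal_features(video))
score = quality_score(feat, w=rng.random(4), b=0.0)
```

In the actual model, the fused representation would be learned end-to-end against subjective quality labels; the sketch only mirrors the data flow (two streams → fusion → attention → scalar quality score).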
Funder
Natural Science Foundation of Zhejiang Province