3D Tensor Auto-encoder with Application to Video Compression

Published:2021-06 Issue:2 Volume:17 Page:1-18
ISSN:1551-6857
Container-title:ACM Transactions on Multimedia Computing, Communications, and Applications
language:en
Short-container-title:ACM Trans. Multimedia Comput. Commun. Appl.

Author:

Li Yang¹, Liu Guangcan², Sun Yubao², Liu Qingshan², Chen Shengyong³

Affiliation:

1. School of IoT Engineering (School of Information Security), Jiangsu Vocational College of Information Technology, Wuxi, China

2. Nanjing University of Information Science and Technology, Nanjing, China

3. Key Laboratory of Computer Vision and System, Ministry of Education, Tianjin University of Technology, Tianjin, China

Abstract

Auto-encoders have been widely used to compress high-dimensional data such as images and videos. However, a traditional auto-encoder network needs to store a large number of parameters: when the input data is of dimension n, the number of parameters in an auto-encoder is in general O(n). In this article, we introduce a network structure called the 3D Tensor Auto-Encoder (3DTAE). Unlike the traditional auto-encoder, in which a video is represented as a vector, our 3DTAE treats videos as 3D tensors and passes tensor objects directly through the network. The weights of each layer are represented by three small matrices, and thus the number of parameters in 3DTAE is just O(n^{1/3}). The compact nature of 3DTAE fits the needs of video compression well. Given an ensemble of high-dimensional videos, we represent them as 3DTAE networks plus some small core tensors, and we further quantize the network parameters and the core tensors to get the final compressed data. Experimental results verify the efficiency of 3DTAE.
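
The parameter saving described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it assumes that "three small matrices" means one matrix applied along each mode of the 3D tensor, and it leaves out any bias or nonlinearity. The function name tensor_layer, the variable names, and the toy sizes (a 16 x 16 x 16 input mapped to a 4 x 4 x 4 core) are all hypothetical.

```python
import numpy as np

def tensor_layer(x, u1, u2, u3):
    """Multiply a 3D tensor by one small matrix along each of its three modes.

    x has shape (n1, n2, n3); u_k has shape (m_k, n_k); result is (m1, m2, m3).
    Assumed form of a tensor layer, with no bias or activation.
    """
    return np.einsum("ia,jb,kc,abc->ijk", u1, u2, u3, x)

rng = np.random.default_rng(0)
n1, n2, n3 = 16, 16, 16   # toy "video" size (height, width, frames)
m1, m2, m3 = 4, 4, 4      # toy core tensor size

video = rng.standard_normal((n1, n2, n3))
u1 = rng.standard_normal((m1, n1))
u2 = rng.standard_normal((m2, n2))
u3 = rng.standard_normal((m3, n3))

core = tensor_layer(video, u1, u2, u3)

# Parameters stored by the tensor layer: three small matrices.
tensor_params = u1.size + u2.size + u3.size

# Parameters a dense layer would need for the same vectorized mapping.
dense_params = (n1 * n2 * n3) * (m1 * m2 * m3)

# The tensor layer equals a dense layer whose weight matrix is constrained
# to be a Kronecker product of the three small matrices.
dense_weight = np.kron(u1, np.kron(u2, u3))
core_from_dense = (dense_weight @ video.ravel()).reshape(m1, m2, m3)

print(core.shape, tensor_params, dense_params)
print(np.allclose(core, core_from_dense))
```

In this toy setting the tensor layer stores 192 weights, whereas an unconstrained dense layer mapping the 4096-dimensional vectorized input to 64 outputs would store 262144. The Kronecker check shows that the tensor layer is a structured special case of the dense layer, which is where the reduction in stored parameters comes from.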

Funder

National Natural Science Foundation of China

Higher Vocational Education Teaching Fusion Production Integration Platform Construction Projects of Jiangsu Province

Research Project of Jiangsu Vocational College of Information Technology

New Generation AI Major Project of Ministry of Science and Technology of China

High Level of Jiangsu Province Key Construction Project Fund

“Qing Lan Project” Teaching Team in Colleges and Universities of Jiangsu Province

Publisher

Association for Computing Machinery (ACM)

Subject

Computer Networks and Communications, Hardware and Architecture

Link

https://dl.acm.org/doi/pdf/10.1145/3431768

Cited by 1 article.

1. A Novel Local Binary Temporal Convolutional Neural Network for Bearing Fault Diagnosis;IEEE Transactions on Instrumentation and Measurement;2023