Robust and efficient airplane cockpit video coding leveraging temporal redundancy

Published: 2024-04-04
ISSN: 1573-7721
Container-title: Multimedia Tools and Applications
Language: en
Short-container-title: Multimed Tools Appl

Author:

Mitrica Iulia, Fiandrotti Attilio, Ruellan Christophe, Cagnazzo Marco

Abstract

Airplane cockpit screens consist of virtual instruments where characters, numbers, and graphics are overlaid on a black or natural background. Recording the cockpit screen allows one to log vital plane data, as aircraft manufacturers do not offer direct access to raw data. However, traditional video codecs struggle to preserve character readability at the required low bit-rates. We showed in previous work that large rate-distortion gains can be achieved if the characters are encoded as text rather than as pixels. We now leverage temporal redundancy both to achieve robust character recognition and to improve encoding efficiency. A convolutional neural network is trained for character classification over synthetic samples augmented with occlusions to gain robustness against overlapping graphics. Further robustness to background occlusions is brought by a probabilistic framework that error-corrects the output of the convolutional neural network. Next, we propose a predictive text coding technique specifically tailored for text in cockpit videos that achieves competitive performance over commodity lossless methods. Experiments with real cockpit video footage show large rate-distortion gains for the proposed method with respect to three different video compression standards. Notably, the H.264/AVC codec retrofitted with our method outperforms H.265/HEVC-SCC and is competitive with the much more complex H.266/VVC while preserving text and graphics. The entire pipeline described in this work has been implemented at Safran Electronics as an embedded avionics system drawing just 2 W of power thanks to a combination of software and FPGA implementation.
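
The abstract does not spell out the predictive text coder, so the following is only a minimal sketch of the general idea of exploiting temporal redundancy in on-screen text, not the authors' method. It assumes each text field has a fixed length and position from frame to frame; the function names `encode_delta` and `decode_delta` and the sample strings are hypothetical.

```python
# Illustrative only: a toy temporal-delta coder for a fixed-length text field.
# This is NOT the coder described in the paper; it merely shows why text that
# changes slowly between frames is cheap to encode as differences.

def encode_delta(prev, curr):
    """Return (position, char) pairs where curr differs from prev."""
    if len(prev) != len(curr):
        raise ValueError("fields must have the same length")
    return [(i, c) for i, (p, c) in enumerate(zip(prev, curr)) if p != c]


def decode_delta(prev, delta):
    """Rebuild the current field from the previous one and a delta."""
    chars = list(prev)
    for i, c in delta:
        chars[i] = c
    return "".join(chars)


# Hypothetical altitude readout over three consecutive frames.
frames = ["ALT 10250", "ALT 10250", "ALT 10275"]

# The first frame is predicted from an all-blank field of the same length.
prev = " " * len(frames[0])
deltas = []
for frame in frames:
    deltas.append(encode_delta(prev, frame))
    prev = frame

# Decoding replays the deltas from the same blank starting field.
prev = " " * len(frames[0])
decoded = []
for delta in deltas:
    prev = decode_delta(prev, delta)
    decoded.append(prev)
```

In this toy example the second frame produces an empty delta and the third only two entries, which is the kind of redundancy a predictive scheme can exploit. A real coder would additionally entropy-code the deltas and handle fields that appear, disappear, or move.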

Funder

Università degli Studi di Torino

Publisher

Springer Science and Business Media LLC

Link

https://link.springer.com/content/pdf/10.1007/s11042-024-18755-2.pdf

References (29 articles)

1. Bjontegaard G (2001) Calculation of average PSNR differences between RD-curves. In: VCEG Meeting, Austin, USA

2. Bross B, Wang YK, Ye Y et al (2021) Overview of the versatile video coding (VVC) standard and its applications. IEEE Trans Circuits Syst Video Technol

3. Burrows M, Wheeler D (1994) A block-sorting lossless data compression algorithm. In: Digital SRC research report, Citeseer

4. Cagnazzo M, Parrilli S, Poggi G et al (2007) Costs and advantages of object-based image coding with shape-adaptive wavelet transform. EURASIP J Image Video Process 2007:78323, 13 pp. https://doi.org/10.1155/2007/78323

5. De Queiroz RL, Fan Z, Tran TD (2000) Optimizing block-thresholding segmentation for multilayer compression of compound images. IEEE Trans Image Process 9(9):1461–1471