An Unpaired Thermal Infrared Image Translation Method Using GMA-CycleGAN

Author:

Yang Shihao1,Sun Min1,Lou Xiayin1,Yang Hanjun1,Zhou Hang1ORCID

Affiliation:

1. Institute of Remote Sensing and GIS, Peking University, Beijing 100871, China

Abstract

Automatically translating chromaticity-free thermal infrared (TIR) images into realistic color visible (CV) images is of great significance for autonomous vehicles, emergency rescue, robot navigation, nighttime video surveillance, and many other fields. Most recent designs use end-to-end neural networks to translate TIR directly to CV; however, compared to these networks, TIR has low contrast and an unclear texture for CV translation. Thus, directly translating the TIR temperature value of only one channel to the RGB color value of three channels without adding additional constraints or semantic information does not handle the one-to-three mapping problem between different domains in a good way, causing the translated CV images not only to have blurred edges but also color confusion. As for the methodology of the work, considering that in the translation from TIR to CV the most important process is to map information from the temperature domain into the color domain, an improved CycleGAN (GMA-CycleGAN) is proposed in this work in order to translate TIR images to grayscale visible (GV) images. Although the two domains have different properties, the numerical mapping is one-to-one, which reduces the color confusion caused by one-to-three mapping when translating TIR to CV. Then, a GV-CV translation network is applied to obtain CV images. Since the process of decomposing GV images into CV images is carried out in the same domain, edge blurring can be avoided. To enhance the boundary gradient between the object (pedestrian and vehicle) and the background, a mask attention module based on the TIR temperature mask and the CV semantic mask is designed without increasing the network parameters, and it is added to the feature encoding and decoding convolution layers of the CycleGAN generator. Moreover, a perceptual loss term is applied to the original CycleGAN loss function to bring the translated images closer to the real images regarding the space feature. In order to verify the effectiveness of the proposed method, the FLIR dataset is used for experiments, and the obtained results show that, compared to the state-of-the-art model, the subjective quality of the translated CV images obtained by the proposed method is better, as the objective evaluation metric FID (Fréchet inception distance) is reduced by 2.42 and the PSNR (peak signal-to-noise ratio) is improved by 1.43.

Publisher

MDPI AG

Subject

General Earth and Planetary Sciences

同舟云学术

1.学者识别学者识别

2.学术分析学术分析

3.人才评估人才评估

"同舟云学术"是以全球学者为主线,采集、加工和组织学术论文而形成的新型学术文献查询和分析系统,可以对全球学者进行文献检索和人才价值评估。用户可以通过关注某些学科领域的顶尖人物而持续追踪该领域的学科进展和研究前沿。经过近期的数据扩容,当前同舟云学术共收录了国内外主流学术期刊6万余种,收集的期刊论文及会议论文总量共计约1.5亿篇,并以每天添加12000余篇中外论文的速度递增。我们也可以为用户提供个性化、定制化的学者数据。欢迎来电咨询!咨询电话:010-8811{复制后删除}0370

www.globalauthorid.com

TOP

Copyright © 2019-2024 北京同舟云网络信息技术有限公司
京公网安备11010802033243号  京ICP备18003416号-3