Author:
Shao Mingwen, Han Minggui, Meng Lingzhuang, Liu Fukang
Abstract
Contrastive learning for Unpaired image-to-image Translation (CUT) aims to learn a mapping from a source to a target domain with an unpaired dataset, using a contrastive loss to maximize the mutual information between real and generated images. However, existing CUT-based methods exhibit unsatisfactory visual quality due to the incorrect localization of objects and backgrounds; in particular, they mistakenly transform the background to match the object pattern on layout-changing datasets. To alleviate this issue, we present Background-Focused Contrastive learning for Unpaired image-to-image Translation (BFCUT), which improves the consistency of the background between a real image and its generated counterpart. Specifically, we first generate heat maps that explicitly locate objects and backgrounds for the subsequent contrastive loss and global background similarity loss. Then, representative queries of objects and backgrounds, rather than randomly sampled queries, are selected for the contrastive loss to promote the realism of objects and the preservation of backgrounds. Meanwhile, global semantic vectors containing little object information are extracted with the help of the heat maps, and the vectors of real images are aligned with those of their corresponding generated images through a global background similarity loss to further preserve the backgrounds. Our BFCUT alleviates the erroneous translation of backgrounds and generates more realistic images. Extensive experiments on three datasets demonstrate better quantitative results and qualitative visual effects.
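To make the background-focused alignment described above more concrete, the following is a minimal PyTorch-style sketch of a global background similarity loss under stated assumptions: the function name, tensor shapes, inverted-heat-map weighting, and the cosine-similarity formulation are illustrative choices, not the authors' exact implementation.

```python
import torch
import torch.nn.functional as F

def global_background_similarity_loss(feat_real, feat_fake, heat_map):
    """Illustrative sketch of a global background similarity loss.

    feat_real, feat_fake: (B, C, H, W) encoder features of the real and
                          generated images.
    heat_map:             (B, 1, H, W) object heat map with values in [0, 1],
                          where high values mark object regions.
    """
    # Invert the object heat map so background regions receive high weight.
    bg_weight = 1.0 - heat_map

    # Weighted global average pooling yields one background-dominated
    # semantic vector per image: shape (B, C).
    denom = bg_weight.sum(dim=(2, 3)).clamp(min=1e-6)
    v_real = (feat_real * bg_weight).sum(dim=(2, 3)) / denom
    v_fake = (feat_fake * bg_weight).sum(dim=(2, 3)) / denom

    # Align the global background vectors of the real image and its translation.
    return (1.0 - F.cosine_similarity(v_real, v_fake, dim=1)).mean()
```

In this sketch, minimizing the loss pulls the background-weighted global vectors of a real image and its generated counterpart together while leaving object regions largely unconstrained, which is one plausible way to encourage background preservation.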
Publisher
Research Square Platform LLC