Global-Affine and Local-Specific Generative Adversarial Network for semantic-guided image generation

Published:2021 Issue:3 Volume:4 Page:145
ISSN:2577-8838
Container-title:Mathematical Foundations of Computing
Short-container-title:MFC

Author:

Zhang Susu, Ni Jiancheng, Hou Lijun, Zhou Zili, Hou Jie, Gao Feng

Abstract

Recent progress in learning image feature representations has opened the way for tasks such as label-to-image and text-to-image synthesis. However, existing methods widely struggle to synthesize fine-grained textures and small-scale instances. In this paper, we propose a novel Global-Affine and Local-Specific Generative Adversarial Network (GALS-GAN) that explicitly constructs global semantic layouts and learns distinct instance-level features. To achieve this, we adopt a graph convolutional network to compute instance locations and spatial relationships from scene graphs, which allows our model to obtain high-fidelity semantic layouts. In addition, a local-specific generator, in which a feature filtering mechanism separately learns semantic maps for different categories, is used to disentangle and generate category-specific visual features. Moreover, because these two generation sub-networks are highly complementary, we apply a weight map predictor to better combine the global and local pathways. Extensive experiments on the COCO-Stuff and Visual Genome datasets demonstrate the superior generation performance of our model over previous methods: our approach is better at capturing photo-realistic local characteristics and at rendering small-sized entities in greater detail.
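The abstract does not say how the predicted weight map combines the two pathways. One plausible reading, offered here only as an illustrative sketch, is a per-pixel convex blend of the global and local outputs. The function name `fuse_pathways`, the blend formula, and the toy values are all assumptions and are not taken from the paper.

```python
def fuse_pathways(global_out, local_out, weight_map):
    """Blend two same-sized 2D grids pixel by pixel.

    weight_map holds values in [0, 1]: 1 keeps the global output,
    0 keeps the local output. This convex blend is an assumption,
    not the method described in the paper.
    """
    if not (len(global_out) == len(local_out) == len(weight_map)):
        raise ValueError("inputs must have the same height")
    fused = []
    for g_row, l_row, w_row in zip(global_out, local_out, weight_map):
        if not (len(g_row) == len(l_row) == len(w_row)):
            raise ValueError("inputs must have the same width")
        # Per-pixel convex combination of the two pathway outputs.
        fused.append([w * g + (1.0 - w) * l
                      for g, l, w in zip(g_row, l_row, w_row)])
    return fused


# Toy 2x2 single-channel example with hypothetical values.
global_out = [[0.2, 0.4], [0.6, 0.8]]
local_out = [[1.0, 0.0], [0.5, 0.8]]
weight_map = [[1.0, 0.0], [0.5, 0.25]]
fused = fuse_pathways(global_out, local_out, weight_map)
```

In this sketch, a weight near 1 lets the global layout pathway dominate a pixel, while a weight near 0 defers to the local pathway, which is one way the two sub-networks could complement each other.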

Publisher

American Institute of Mathematical Sciences (AIMS)

Subject

Artificial Intelligence, Computational Mathematics, Computational Theory and Mathematics, Theoretical Computer Science

References: 38 articles.

1. H. Caesar, J. Uijlings and V. Ferrari, COCO-Stuff: Thing and stuff classes in context, IEEE Conference on Computer Vision and Pattern Recognition, (2018), 1209–1218.

2. W. L. Chen and J. Hays, SketchyGAN: Towards diverse and realistic sketch to image synthesis, IEEE Conference on Computer Vision and Pattern Recognition, (2018), 9416–9425.

3. B. Chen, T. Liu, K. Liu, H. Liu and S. Pei, Image super-resolution using complex dense block on generative adversarial networks, IEEE International Conference on Image Processing, (2019), 2866–2870.

4. Y. Choi, M. Choi, M. Kim, J. M. Ha, S. H. Kim and J. Choo, StarGAN: Unified generative adversarial networks for multi-domain image-to-image translation, IEEE Conference on Computer Vision and Pattern Recognition, (2018), 8789–8797.

5. Y. Choi, Y. Uh, J. Yoo and J. W. Ha, StarGAN v2: Diverse image synthesis for multiple domains, IEEE Conference on Computer Vision and Pattern Recognition, (2020), 8185–8194.