Affiliation:
1. City University of Hong Kong, Kowloon, Hong Kong SAR, China
Abstract
We present a deep neural network called the light field generative adversarial network (LFGAN) that synthesizes a 4D light field from a single 2D RGB image. We generate light fields using a single image super-resolution (SISR) technique based on two important observations. First, the small baseline gives rise to high similarity between the full light field image and each sub-aperture view. Second, the occlusion edge at any spatial coordinate of a sub-aperture view has the same orientation as the occlusion edge at the corresponding angular patch, implying that the occlusion information in the angular domain can be inferred from the sub-aperture local information. We employ the Wasserstein GAN with gradient penalty (WGAN-GP) to learn the color and geometry information from the light field datasets. The network can generate a plausible 4D light field comprising 8×8 angular views from a single sub-aperture 2D image. We propose new loss terms, namely the epipolar plane image (EPI) and brightness regularization (BRI) losses, as well as a novel multi-stage training framework that introduces the loss terms at different stages to generate superior light fields. The EPI loss encourages the network to learn the geometric features of the light fields, and the BRI loss preserves brightness consistency across different sub-aperture views. Two datasets have been used to evaluate our method: in addition to an existing light field dataset capturing scenes of flowers and plants, we have built a large dataset of toy animals consisting of 2,100 light fields captured with a plenoptic camera. We have performed comprehensive ablation studies to evaluate the effects of individual loss terms and the multi-stage training strategy, and have compared LFGAN to other state-of-the-art techniques. Qualitative and quantitative evaluation demonstrates that LFGAN can effectively estimate complex occlusions and geometry in challenging scenes and outperform existing techniques.
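The abstract names two regularizers, an EPI loss for geometric structure and a BRI loss for cross-view brightness consistency, combined with a WGAN-GP adversarial objective. As a rough illustration only (not the authors' implementation), the sketch below shows one plausible PyTorch formulation for an 8×8 light field tensor; the tensor layout, the gradient-based EPI term, the brightness-mean BRI term, and the names `epi_loss`, `bri_loss`, and `generator_loss` are all assumptions made for this example.

```python
# Illustrative sketch only (assumed tensor layout and loss formulations,
# not the paper's code). Light fields are (B, V, U, C, H, W) tensors with
# an 8x8 angular grid of sub-aperture views.
import torch

def epi_loss(pred, target):
    """Match gradients on horizontal epipolar plane images (EPIs).

    Fixing a view row v and an image row y leaves a (U, W) slice whose
    slanted line structure encodes scene geometry; penalizing gradient
    differences along the angular (U) and spatial (W) axes pushes the
    generator toward the correct disparity pattern.
    """
    def grads(lf):
        d_ang = lf[:, :, 1:] - lf[:, :, :-1]    # finite difference over U
        d_spa = lf[..., 1:] - lf[..., :-1]      # finite difference over W
        return d_ang, d_spa

    p_ang, p_spa = grads(pred)
    t_ang, t_spa = grads(target)
    return (p_ang - t_ang).abs().mean() + (p_spa - t_spa).abs().mean()

def bri_loss(pred, center_view):
    """Keep the mean brightness of every synthesized view close to the
    brightness of the single input (central) sub-aperture view."""
    view_means = pred.mean(dim=(3, 4, 5))            # (B, V, U)
    center_mean = center_view.mean(dim=(1, 2, 3))    # (B,)
    return (view_means - center_mean[:, None, None]).abs().mean()

def generator_loss(adv_term, pred, target, center_view,
                   lambda_epi=1.0, lambda_bri=1.0):
    """WGAN-GP adversarial term plus the two regularizers; a multi-stage
    schedule could enable the weighted terms at different training stages."""
    return (adv_term
            + lambda_epi * epi_loss(pred, target)
            + lambda_bri * bri_loss(pred, center_view))
```

In this sketch, the multi-stage training described in the abstract would correspond to adjusting `lambda_epi` and `lambda_bri` (or switching the terms on) at different stages of training; the actual schedule and weighting are defined in the paper, not here.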
Funder
City University of Hong Kong
Research Grants Council
Publisher
Association for Computing Machinery (ACM)
Subject
Computer Networks and Communications, Hardware and Architecture
Cited by
14 articles.
1. Stereo-Knowledge Distillation from dpMV to Dual Pixels for Light Field Video Reconstruction;2024 IEEE International Conference on Computational Photography (ICCP);2024-07-22
2. Suitable and Style-Consistent Multi-Texture Recommendation for Cartoon Illustrations;ACM Transactions on Multimedia Computing, Communications, and Applications;2024-05-16
3. LFSphereNet: Real Time Spherical Light Field Reconstruction from a Single Omnidirectional Image;Proceedings of the 20th ACM SIGGRAPH European Conference on Visual Media Production;2023-11-30
4. Light Field Synthesis from a Monocular Image using Variable LDI;2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW);2023-06
5. Novel View Synthesis from a Single Unposed Image via Unsupervised Learning;ACM Transactions on Multimedia Computing, Communications, and Applications;2023-05-31