Probabilistic and semantic descriptions of image manifolds and their applications
Published:2023-11-02
Volume:5
ISSN:2624-9898
Container-title:Frontiers in Computer Science
Short-container-title:Front. Comput. Sci.
Author:
Tu Peter,Yang Zhaoyuan,Hartley Richard,Xu Zhiwei,Zhang Jing,Fu Yiwei,Campbell Dylan,Singh Jaskirat,Wang Tianyu
Abstract
This paper begins with a description of methods for estimating probability density functions for images. These methods reflect the observation that such data is usually constrained to lie in restricted regions of the high-dimensional image space: not every pattern of pixels is an image. It is common to say that images lie on a lower-dimensional manifold in the high-dimensional space. However, although images may lie on such lower-dimensional manifolds, it is not the case that all points on the manifold have an equal probability of being images. Images are unevenly distributed on the manifold, and our task is to devise ways to model this distribution as a probability distribution. In pursuing this goal, we consider generative models that are popular in the AI and computer vision communities. For our purposes, generative/probabilistic models should have the properties of (1) sample generation: it should be possible to sample from this distribution according to the modeled density function, and (2) probability computation: given a previously unseen sample from the dataset of interest, one should be able to compute the probability of the sample, at least up to a normalizing constant. To this end, we investigate the use of methods such as normalizing flows and diffusion models. We then show how semantic interpretations are used to describe points on the manifold. To achieve this, we consider an emergent language framework that makes use of variational encoders to produce a disentangled representation of points that reside on a given manifold. Trajectories between points on a manifold can then be described in terms of evolving semantic descriptions. In addition to describing the manifold in terms of density and semantic disentanglement, we also show that such probabilistic descriptions (bounded) can be used to improve semantic consistency by constructing defenses against adversarial attacks.
We evaluate our methods on three tasks: likelihood estimation on CelebA and point samples, with improved semantic robustness and out-of-distribution detection capability; semantic disentanglement on MNIST and CelebA, with explainable and editable semantic interpolation; and defense against patch attacks on CelebA and Fashion-MNIST, with significantly improved classification accuracy. We also discuss the limitations of applying our likelihood estimation to 2D images in diffusion models.
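The two properties required of a generative model in the abstract, sample generation and probability computation, can be illustrated with a toy example. The sketch below is not the authors' model: it assumes a single invertible affine map applied to a standard Gaussian, which is the simplest possible normalizing flow, and the names `AffineFlow`, `loc`, and `scale_tril` are hypothetical. The log-density follows from the change-of-variables formula, log p(x) = log p_z(z) - log|det L|, with z = L^{-1}(x - loc).

```python
import numpy as np


class AffineFlow:
    """Toy flow x = loc + L @ z with z ~ N(0, I); L is assumed lower-triangular."""

    def __init__(self, loc, scale_tril):
        self.loc = np.asarray(loc, dtype=float)
        self.scale_tril = np.asarray(scale_tril, dtype=float)
        self.dim = self.loc.shape[0]

    def sample(self, n, rng):
        # Property (1): draw base samples and push them through the map.
        z = rng.standard_normal((n, self.dim))
        return self.loc + z @ self.scale_tril.T

    def log_prob(self, x):
        # Property (2): invert the map, then apply the change of variables.
        x = np.atleast_2d(np.asarray(x, dtype=float))
        z = np.linalg.solve(self.scale_tril, (x - self.loc).T).T
        log_base = -0.5 * np.sum(z**2, axis=1) - 0.5 * self.dim * np.log(2 * np.pi)
        # For a triangular matrix, |det L| is the product of |diagonal entries|.
        log_det = np.sum(np.log(np.abs(np.diag(self.scale_tril))))
        return log_base - log_det


rng = np.random.default_rng(0)
flow = AffineFlow(loc=[1.0, -2.0], scale_tril=[[2.0, 0.0], [0.5, 1.0]])
samples = flow.sample(5, rng)        # sample generation
log_p = flow.log_prob(samples)       # exact log-density of each sample
far_log_p = flow.log_prob([100.0, 100.0])[0]  # a point far from the data
```

Here the density is exactly normalized. A point far from the mode receives a much lower log-density than typical samples, which is the basic intuition behind likelihood-based out-of-distribution detection. Practical flows stack many learned invertible layers instead of one fixed affine map.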
Funder
Defense Advanced Research Projects Agency
Publisher
Frontiers Media SA
Subject
Computer Science Applications,Computer Vision and Pattern Recognition,Human-Computer Interaction,Computer Science (miscellaneous)
Reference
59 articles.