Abstract
Coded aperture compressive temporal imaging (CACTI) uses compressive sensing (CS)
theory to compress three-dimensional (3D) signals into a single 2D snapshot
measurement, thereby acquiring high-dimensional (HD) visual signals.
Reconstruction from such measurements often suffers from low quality and
slow runtime, and deep learning has become the mainstream approach to
signal reconstruction because it addresses both problems and has
shown superior performance. Currently, however, the best-performing networks
are typically large supervised models that require
vast training sets, which can be difficult or expensive to obtain. This
limits their application in real optical imaging systems. In this
paper, we propose a lightweight reconstruction network that recovers
HD signals from noisy compressed measurements alone. We design a
convolution-based block that extracts and fuses local and global
features, and stack multiple such blocks to form a lightweight
architecture. In addition, we derive unsupervised loss functions
based on the geometric characteristics of the signal to ensure that the
network generalizes well and can
approximate the reconstruction process of real optical systems.
Experimental results show that the proposed network significantly
reduces model size while maintaining high performance in recovering
dynamic scenes, and that the unsupervised video reconstruction network
approaches its supervised version in terms of reconstruction
performance.
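
For illustration, the snapshot measurement described above is commonly modeled as Y = sum_t (C_t * X_t) + N, where X_t are the video frames, C_t are per-frame coded-aperture masks, * is element-wise multiplication, and N is noise. The following is a minimal sketch under that standard model, using only the Python standard library. The function name `cacti_measure`, the binary masks, and the Gaussian noise are illustrative assumptions, not details taken from this paper.

```python
import random


def cacti_measure(frames, masks, noise_std=0.0, rng=None):
    """Collapse T coded frames into one 2D snapshot: Y = sum_t C_t * X_t + N.

    frames, masks: lists of T nested lists, each of shape H x W.
    noise_std: standard deviation of optional additive Gaussian noise
               (an assumed noise model, for illustration only).
    """
    rng = rng or random.Random(0)
    if len(frames) != len(masks):
        raise ValueError("need exactly one mask per frame")
    height, width = len(frames[0]), len(frames[0][0])
    snapshot = [[0.0] * width for _ in range(height)]
    # Modulate each frame by its mask and integrate over time.
    for frame, mask in zip(frames, masks):
        for i in range(height):
            for j in range(width):
                snapshot[i][j] += mask[i][j] * frame[i][j]
    # Optionally add sensor noise to the integrated measurement.
    if noise_std > 0:
        for i in range(height):
            for j in range(width):
                snapshot[i][j] += rng.gauss(0.0, noise_std)
    return snapshot


# Toy example: T = 2 frames of size 2 x 2 with binary masks.
frames = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[5.0, 6.0], [7.0, 8.0]],
]
masks = [
    [[1, 0], [0, 1]],
    [[0, 1], [1, 1]],
]
snapshot = cacti_measure(frames, masks)
noisy_snapshot = cacti_measure(frames, masks, noise_std=0.1)
```

In this toy case the two 2 x 2 frames collapse into a single 2 x 2 measurement, so a reconstruction network must recover T times as many unknowns as it observes. This is why priors, learned or otherwise, are needed.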
Funder
National Natural Science Foundation of China
Beijing Jiaotong University