Abstract
To investigate the relationship between perception and imagery, we model the visual ventral stream with encoder and decoder components built from capsule networks. The proposed network combines V1 and V2 from CorNet-Z with a capsule network using the routing-by-agreement algorithm for V4 and IT. The decoder reverses this architecture to model the feedback activation patterns of the visual ventral stream. The model was trained on EMNIST (letters H, S, C, T). Classification performance was high, with good generalization to different sizes, positions, and rotations. For occluded stimuli, contextual information in the feedback path supported reconstructions that retained high classification performance. Additionally, a pre-trained network was used to reconstruct remapped fMRI activation patterns from higher visual areas; reconstructions of single-trial imagery data showed significant correlations with the physical letter stimuli. To test biological plausibility, the fMRI activation patterns of V1 and V2, and their reconstructions obtained with population receptive field mapping and an autoencoder, were related to the network’s activation patterns. Representational similarity analysis and spatial correlations indicated an overlap in information content between the capsule network and the fMRI activations. Given the capsule network’s high generalization performance and the implemented feedback connections, the proposed network is a promising approach to improving current models of perception and imagery. Further research is needed to compare it with established networks that model the visual ventral stream.
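The routing-by-agreement algorithm mentioned for the V4/IT stages refers to the dynamic routing procedure of Sabour et al. (2017). Below is a minimal NumPy sketch of that procedure for orientation; the capsule counts, dimensions, and iteration number are illustrative assumptions, not values taken from this paper.

```python
import numpy as np

def squash(s, axis=-1, eps=1e-8):
    """Capsule non-linearity: preserves direction, maps norm into [0, 1)."""
    sq_norm = np.sum(s ** 2, axis=axis, keepdims=True)
    return (sq_norm / (1.0 + sq_norm)) * s / np.sqrt(sq_norm + eps)

def routing_by_agreement(u_hat, num_iters=3):
    """Dynamic routing between capsule layers.

    u_hat: prediction vectors, shape (num_in, num_out, dim_out) —
           each lower-level capsule's prediction for each higher-level capsule.
    Returns the higher-level capsule outputs, shape (num_out, dim_out).
    """
    num_in, num_out, _ = u_hat.shape
    b = np.zeros((num_in, num_out))  # routing logits, start uniform
    for _ in range(num_iters):
        # coupling coefficients: softmax over output capsules
        c = np.exp(b) / np.exp(b).sum(axis=1, keepdims=True)
        s = (c[..., None] * u_hat).sum(axis=0)   # weighted sum of predictions
        v = squash(s)                            # candidate output capsules
        b = b + (u_hat * v[None]).sum(axis=-1)   # boost routes that agree
    return v

# Toy example: 8 input capsules routed to 4 output capsules of dimension 16.
rng = np.random.default_rng(0)
u_hat = rng.standard_normal((8, 4, 16))
v = routing_by_agreement(u_hat)
```

The agreement update increases the logit for a route whenever a lower capsule's prediction points in the same direction as the resulting higher capsule, which is what lets longer output vectors signal confident part–whole assignments.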
Publisher
Cold Spring Harbor Laboratory