Abstract
Understanding the brain’s perception algorithm is a highly intricate problem, as the inherent complexity of sensory inputs and the brain’s nonlinear processing make characterizing sensory representations difficult. Recent studies have shown that functional models, capable of predicting large-scale neuronal activity in response to arbitrary sensory input, can be powerful tools for characterizing neuronal representations by enabling high-throughput in silico experiments. However, accurately modeling responses to dynamic and ecologically relevant inputs like videos remains challenging, particularly when generalizing to new stimulus domains outside the training distribution. Inspired by recent breakthroughs in artificial intelligence, where foundation models trained on vast quantities of data have demonstrated remarkable capabilities and generalization, we developed a “foundation model” of the mouse visual cortex: a deep neural network trained on large amounts of neuronal responses to ecological videos from multiple visual cortical areas and mice. The model accurately predicted neuronal responses not only to natural videos but also to various new stimulus domains, such as coherent moving dots and noise patterns, underscoring its generalization abilities. The foundation model could also be adapted to new mice with minimal natural movie training data. We applied the foundation model to the MICrONS dataset: a study of the brain that integrates structure with function at unprecedented scale, containing nanometer-scale morphology, connectivity with >500,000,000 synapses, and function of >70,000 neurons within a ∼1 mm³ volume spanning multiple areas of the mouse visual cortex. This accurate functional model of the MICrONS data opens the possibility for a systematic characterization of the relationship between circuit structure and function.
By precisely capturing the response properties of the visual cortex and generalizing to new stimulus domains and mice, foundation models can pave the way for a deeper understanding of visual computation.
Publisher
Cold Spring Harbor Laboratory