Gradient-Based Competitive Learning: Theory

Published:2023-11-23 Issue:2 Volume:16 Page:608-623
ISSN:1866-9956
Container-title:Cognitive Computation
language:en
Short-container-title:Cogn Comput

Author:

Cirrincione Giansalvo,Randazzo Vincenzo,Barbiero Pietro,Ciravegna Gabriele,Pasero Eros

Abstract

Deep learning has recently been used to extract relevant features for representing input data, including in the unsupervised setting. However, state-of-the-art techniques focus mostly on algorithmic efficiency and accuracy rather than on mimicking the input manifold. In contrast, competitive learning is a powerful tool for replicating the topology of the input distribution. It is cognitively and biologically inspired, as it is founded on Hebbian learning, a neuropsychological theory claiming that neurons can increase their specialization by competing for the right to respond to, or represent, a subset of the input data. This paper introduces a novel perspective by combining these two techniques: unsupervised gradient-based learning and competitive learning. The theory is based on the intuition that neural networks can learn topological structures by working directly on the transpose of the input matrix. To this end, the vanilla competitive layer and its dual are presented. The former is representative of a standard competitive layer for deep clustering, while the latter is trained on the transposed matrix. The equivalence of the two layers is extensively proven both theoretically and experimentally. The dual competitive layer has better properties: unlike the vanilla layer, it directly outputs the prototypes of the data inputs while still allowing learning by backpropagation. More importantly, this paper proves theoretically that the dual layer is better suited to handling high-dimensional data (e.g., for biological applications), because the estimation of the weights is driven by a constraining subspace that depends not on the input dimensionality but only on the dataset cardinality. The paper thus introduces a novel approach to unsupervised gradient-based competitive learning. This approach is very promising both for small datasets of high-dimensional data and for better exploiting the advantages of a deep architecture, since the dual layer integrates perfectly with the deep layers. A theoretical justification is also given through an analysis of the gradient flow for both the vanilla and dual layers.
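
The idea of a layer that "directly outputs the prototypes" can be illustrated with a small NumPy sketch. This is not the authors' implementation; it assumes a plain quantization loss (mean squared distance to the nearest prototype), a data matrix `X` with one sample per row, and a dual weight matrix `W` with one weight per (prototype, sample) pair, so that the prototypes are `P = W @ X`. The function names, the one-hot initialization, and the step-size scaling are all choices made for illustration.

```python
import numpy as np

def quantization_loss_and_grad(P, X):
    """Mean squared distance of each sample to its nearest prototype,
    plus the gradient of that loss with respect to the prototypes P."""
    n = X.shape[0]
    # Pairwise squared distances, shape (n_samples, n_prototypes).
    d2 = ((X[:, None, :] - P[None, :, :]) ** 2).sum(axis=2)
    winners = d2.argmin(axis=1)  # competition: the nearest prototype wins
    loss = d2[np.arange(n), winners].mean()
    grad_P = np.zeros_like(P)
    # Only the winning prototype of each sample receives a gradient.
    np.add.at(grad_P, winners, 2.0 * (P[winners] - X) / n)
    return loss, grad_P

def train_dual(X, k, lr=0.25, epochs=100, seed=0):
    """Gradient descent on dual weights W (k x n); prototypes are W @ X.
    Assumed setup for illustration, not the paper's exact procedure."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Assumption: start each prototype at a randomly chosen sample.
    W = np.zeros((k, n))
    W[np.arange(k), rng.choice(n, size=k, replace=False)] = 1.0
    # Scale the step by the squared spectral norm of X to keep it stable.
    step = lr / np.linalg.norm(X, 2) ** 2
    for _ in range(epochs):
        P = W @ X                       # the layer output is the prototypes
        _, grad_P = quantization_loss_and_grad(P, X)
        W -= step * (grad_P @ X.T)      # chain rule: dL/dW = dL/dP @ X^T
    return W

# Few samples (n = 20) in a high-dimensional space (d = 50).
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 50))
k = 3
W0 = train_dual(X, k, epochs=0)         # initial weights (same seed)
W = train_dual(X, k)
P = W @ X
loss_before, _ = quantization_loss_and_grad(W0 @ X, X)
loss_after, _ = quantization_loss_and_grad(P, X)
```

In this sketch `W` has shape `(k, n)`, so the number of trainable weights grows with the number of samples rather than with the input dimension, and every prototype stays in the span of the data rows. That loosely mirrors the abstract's remark that the dual layer's weight estimation depends on dataset cardinality rather than input dimensionality.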

Funder

Politecnico di Torino

Publisher

Springer Science and Business Media LLC

Subject

Cognitive Neuroscience,Computer Science Applications,Computer Vision and Pattern Recognition

Link

https://link.springer.com/content/pdf/10.1007/s12559-023-10225-5.pdf
