Abstract
Humans can spontaneously detect complex algebraic structures. Historically, two opposing views have been proposed to explain this ability, which lies at the root of language and music acquisition. Some argue for the existence of an innate and specific mechanism, like “merge” (Chomsky) or “neural recursion” (Dehaene). Others argue that this ability emerges from experience (e.g. Bates), i.e. when generic learning principles continuously process sensory inputs. These two views, however, remain difficult to test experimentally. Here, we use deep learning models to evaluate the factors that lead to the spontaneous detection of algebraic structures in the auditory modality. Specifically, we train multiple deep learning models with variable amounts of natural sounds and a self-supervised learning objective. We then expose these models to the experimental paradigms classically used to evaluate the processing of algebraic structures. Like humans, these models spontaneously detect repeated sequences, probabilistic chunks and complex algebraic structures. Also like humans, this ability diminishes with structure complexity. Importantly, this ability can emerge from experience alone: the more the models are exposed to natural sounds, the more they spontaneously detect increasingly complex structures. Finally, this ability does not emerge in models pretrained only on speech, and emerges more rapidly in models pretrained with music than with environmental sounds. Overall, our study provides an operational framework to clarify which built-in and acquired principles suffice to model humans’ advanced capacity to detect algebraic structures in sounds.
Significance Statement
Experimentalists have repeatedly observed a human advantage in the detection of algebraic structures, notably through auditory paradigms. This ability to detect structure is thought to be key to the emergence of complex cognitive operations. Yet, it remains debated whether this ability is discovered through experience or is innate, in the form of a specific mechanism.
In this article, the authors show how a model progressively and spontaneously learns to detect auditory structure. The model replicates several experimental findings, but only under certain developmental conditions. Notably, exposure to music or environmental sounds, but not speech, is sufficient for the emergence of algebraic structure detection. As a result, this work proposes self-supervised learning as a developmental model of abstract cognitive abilities.
Publisher
Cold Spring Harbor Laboratory