Abstract
We aim to investigate how closely neural networks (NNs) mimic human thinking. As a step in this direction, we study the behavior of artificial neuron(s) that fire most strongly when the input data score high on specific emergent concepts. In this paper, we focus on music, where the emergent concepts are rhythm, pitch and melody as commonly understood by humans. As a black box to pry open, we use Google’s MusicVAE, a pre-trained NN that handles music tracks by encoding them in terms of 512 latent variables. We show that several hundred of these latent variables are “irrelevant” in the sense that they can be set to zero with minimal impact on the reconstruction accuracy. The remaining few dozen latent variables can be sorted in order of relevance by comparing their variances. We show that the first few most relevant variables, and only those, correlate highly with dozens of human-defined measures that describe rhythm and pitch in music pieces, thereby efficiently encapsulating many of these human-understandable concepts in a few nonlinear variables.
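The analysis described in the abstract (rank latent variables by variance, zero out the low-variance ones, then correlate the remaining ones with human-defined measures) can be illustrated with a minimal sketch. This is not the authors' code and does not call MusicVAE: the latent vectors below are synthetic, the 8-dimensional latent space stands in for the 512 variables, and the names `rank_by_variance`, `zero_out`, `pearson` and `measure` are hypothetical helpers introduced only for illustration.

```python
import random
import statistics


def rank_by_variance(latents):
    """Return (indices sorted by decreasing variance, per-dimension variances)."""
    n_dims = len(latents[0])
    variances = [statistics.pvariance([z[d] for z in latents]) for d in range(n_dims)]
    order = sorted(range(n_dims), key=lambda d: variances[d], reverse=True)
    return order, variances


def zero_out(latents, keep):
    """Set every latent dimension not listed in `keep` to zero."""
    keep = set(keep)
    return [[v if d in keep else 0.0 for d, v in enumerate(z)] for z in latents]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy) if sx and sy else 0.0


# Synthetic stand-in for encoder outputs (assumption: not real MusicVAE latents).
rng = random.Random(0)
scales = [3.0, 2.0, 0.01, 0.01, 1.0, 0.01, 0.01, 0.01]  # per-dimension spread
latents = [[rng.gauss(0.0, s) for s in scales] for _ in range(200)]

# Hypothetical human-defined measure, constructed here to depend on latent 0.
measure = [z[0] + rng.gauss(0.0, 0.5) for z in latents]

order, variances = rank_by_variance(latents)
top = order[:3]                   # keep the three highest-variance dimensions
pruned = zero_out(latents, top)   # the remaining dimensions are set to zero
corrs = {d: pearson([z[d] for z in latents], measure) for d in order}

for d in order:
    print(f"latent {d}: variance={variances[d]:.4f}, corr with measure={corrs[d]:+.3f}")
```

In a real study, `pruned` would be passed to the decoder to check how little the reconstruction accuracy changes, and `measure` would be replaced by measures of rhythm and pitch computed from the music pieces themselves.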
Funder
Agencia Estatal de Investigación
Generalitat Valenciana
Ministerio de Ciencia e Innovación
H2020 Marie Skłodowska-Curie Actions
Consejo Superior de Investigaciones Científicas
Publisher
Springer Science and Business Media LLC