Blind Source Separation in Polyphonic Music Recordings Using Deep Neural Networks Trained via Policy Gradients

Published:2021-10-07 Issue:4 Volume:2 Page:637-661
ISSN:2624-6120
Container-title:Signals
language:en
Short-container-title:Signals

Author:

Schulze Sören, Leuschner Johannes, King Emily J.

Abstract

We propose a method for the blind separation of sounds of musical instruments in audio signals. We describe the individual tones via a parametric model, training a dictionary to capture the relative amplitudes of the harmonics. The model parameters are predicted via a U-Net, which is a type of deep neural network. The network is trained without ground truth information, based on the difference between the model prediction and the individual time frames of the short-time Fourier transform. Since some of the model parameters do not yield a useful backpropagation gradient, we model them stochastically and employ the policy gradient instead. To provide phase information and account for inaccuracies in the dictionary-based representation, we also let the network output a direct prediction, which we then use to resynthesize the audio signals for the individual instruments. Due to the flexibility of the neural network, inharmonicity can be incorporated seamlessly and no preprocessing of the input spectra is required. Our algorithm yields high-quality separation results with particularly low interference on a variety of different audio samples, both acoustic and synthetic, provided that the sample contains enough data for the training and that the spectral characteristics of the musical instruments are sufficiently stable to be approximated by the dictionary.
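The policy-gradient idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy loss, the Gaussian policy, and all names (`reinforce_fit`, `target`, `sigma`, and so on) are assumptions chosen for illustration. The loss below is piecewise constant in the sampled parameter, so backpropagation through it would give a zero gradient almost everywhere; the score-function (REINFORCE) estimator instead updates the mean of the sampling distribution.

```python
import numpy as np


def reinforce_fit(target=7.0, steps=300, batch=64, lr=0.1, sigma=1.0, seed=0):
    """Fit the mean of a Gaussian policy using the score-function estimator.

    Hypothetical toy problem: the loss depends on round(z), so it has no
    useful gradient with respect to z itself.
    """
    rng = np.random.default_rng(seed)
    mu = 0.0  # policy parameter: mean of the sampling distribution
    for _ in range(steps):
        # Sample the stochastic parameter from the current policy.
        z = rng.normal(mu, sigma, size=batch)
        # Piecewise-constant loss (assumed for illustration only).
        loss = np.abs(np.round(z) - target)
        # Subtract the batch mean as a simple variance-reducing baseline.
        advantage = loss - loss.mean()
        # Score function: d/dmu log N(z; mu, sigma^2) = (z - mu) / sigma^2.
        score = (z - mu) / sigma**2
        # Monte Carlo estimate of the gradient of the expected loss w.r.t. mu.
        grad = np.mean(advantage * score)
        mu -= lr * grad  # gradient descent on the expected loss
    return mu


mu_hat = reinforce_fit()
print(f"learned mean: {mu_hat:.2f}")
```

In this sketch the learned mean should drift toward the assumed target of 7 even though the loss is never differentiated with respect to the sampled value; only the log-probability of the sample is.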

Funder

Deutsche Forschungsgemeinschaft

Publisher

MDPI AG

Link

https://www.mdpi.com/2624-6120/2/4/39/pdf

Reference: 42 articles.

1. U-Net: Convolutional Networks for Biomedical Image Segmentation

2. Reinforcement Learning; Sutton, 2018

3. Audio Source Separation and Speech Enhancement, 2018

4. Audio Source Separation, 2018

5. Source Separation and Machine Learning; Chien, 2018

Cited by 3 articles.

1. Determined Reverberant Blind Source Separation of Audio Mixing Signals; Intelligent Automation & Soft Computing; 2023

2. The use of polyphony in trumpet playing: developing creativity in Chinese trumpet students engaged in improvisations; Interactive Learning Environments; 2022-09-14

3. On Audio Enhancement via Online Non-Negative Matrix Factorization; 2022 56th Annual Conference on Information Sciences and Systems (CISS); 2022-03-09