Audio Source Separation using Sparse Representations

Page: 246-265
Container-title: Machine Audition

Author:

Nesbit Andrew¹, Jafari Maria G.¹, Vincent Emmanuel², Plumbley Mark D.¹

Affiliation:

1. Queen Mary University of London, United Kingdom

2. INRIA, France

Abstract

The authors address the problem of audio source separation, namely, the recovery of audio signals from recordings of mixtures of those signals. The sparse component analysis framework is a powerful method for achieving this. Sparse orthogonal transforms, in which only a few transform coefficients differ significantly from zero, are developed; once the signal has been transformed, energy is apportioned from each transform coefficient to each estimated source, and, finally, the signal is reconstructed using the inverse transform. The overriding aim of this chapter is to demonstrate how this framework, as exemplified here by two different decomposition methods which adapt to the signal to represent it sparsely, can be used to solve different problems in different mixing scenarios. To address the instantaneous (neither delays nor echoes) and underdetermined (more sources than mixtures) mixing model, a lapped orthogonal transform is adapted to the signal by selecting a basis from a library of predetermined bases. This method is closely related to the windowing methods used in the MPEG audio coding framework. In considering the anechoic (delays but no echoes) and determined (equal number of sources and mixtures) mixing case, a greedy adaptive transform is used based on orthogonal basis functions that are learned from the observed data, instead of being selected from a predetermined library of bases. This is found to encode the signal characteristics by introducing a feedback system between the bases and the observed data. Experiments on mixtures of speech and music signals demonstrate that these methods give good signal approximations and separation performance, and indicate promising directions for future research.
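The three-step pipeline described in the abstract (transform, apportion each coefficient to a source, inverse transform) can be illustrated with a minimal sketch. This is not the chapter's method: it assumes a fixed orthonormal DCT instead of an adaptive lapped or learned transform, a known instantaneous mixing matrix with unit-norm columns, and binary masking (each coefficient goes entirely to one source) as the apportioning rule. The names `dct_matrix`, `separate`, and `demo` are hypothetical.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix; each row is one basis function."""
    k = np.arange(n)[:, None]
    t = np.arange(n)[None, :]
    d = np.sqrt(2.0 / n) * np.cos(np.pi * (t + 0.5) * k / n)
    d[0, :] /= np.sqrt(2.0)
    return d

def separate(mixtures, mixing, transform):
    """Binary-masking separation for an instantaneous mixture.

    mixtures:  (n_mix, n_samples) observed signals
    mixing:    (n_mix, n_src) mixing matrix, ASSUMED known, unit-norm columns
    transform: (n_samples, n_samples) orthonormal transform (rows = basis)
    """
    # Step 1: forward transform of every mixture channel.
    coeffs = mixtures @ transform.T
    # Step 2: project each coefficient vector onto every mixing direction
    # and give the whole coefficient to the best-aligned source.
    proj = mixing.T @ coeffs
    winner = np.argmax(np.abs(proj), axis=0)
    mask = np.arange(mixing.shape[1])[:, None] == winner[None, :]
    src_coeffs = np.where(mask, proj, 0.0)
    # Step 3: inverse transform (the transpose, since the transform is orthonormal).
    return src_coeffs @ transform

def demo():
    """Three sources, two mixtures; sources have disjoint DCT support."""
    n = 64
    D = dct_matrix(n)
    true_coeffs = np.zeros((3, n))
    true_coeffs[0, 3] = 1.0
    true_coeffs[1, 10] = -2.0
    true_coeffs[2, 25] = 0.5
    sources = true_coeffs @ D
    angles = np.array([0.3, 1.2, 2.4])  # illustrative mixing directions
    A = np.vstack([np.cos(angles), np.sin(angles)])
    mixtures = A @ sources
    return sources, separate(mixtures, A, D)
```

In the demo the sources occupy disjoint transform coefficients, so binary masking recovers them up to floating-point error even though there are more sources than mixtures. Real audio is only approximately sparse, which is why the choice of transform matters for separation quality.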

Publisher

IGI Global

References: 31 articles.

1. Abdallah, S. A., & Plumbley, M. D. (2004). Application of geometric dependency analysis to the separation of convolved mixtures. In Proceedings of the International Conference on Independent Component Analysis and Signal Separation (pp. 22-24).

2. Aharon, M. (2006). K-SVD: An algorithm for designing overcomplete dictionaries for sparse representations. IEEE Transactions on Signal Processing.

3. Benaroya, L., Bimbot, F., Gravier, G., & Gribonval, R. (2003). Audio source separation with one sensor for robust speech recognition. In NOLISP-2003 (paper 030).

4. Bertin, N., Badeau, R., & Vincent, E. (2009). Enforcing harmonicity and smoothness in Bayesian non-negative matrix factorization applied to polyphonic music transcription (Report No. 2009D006). Paris, France: Telecom ParisTech.

5. Bofill, P. (2001). Underdetermined blind source separation using sparse representations. Signal Processing.

Cited by 1 article.

1. An Experimental Evaluation of Wiener Filter Smoothing Techniques Applied to Under-Determined Audio Source Separation. In Latent Variable Analysis and Signal Separation (2010).