An efficient parallel kernel based on Cholesky decomposition to accelerate Multichannel Non-Negative Matrix Factorization

Published: 2022-10-17

Author:

Muñoz-Montoro Antonio J.¹, Carabias-Orti Julio J.², Salvati Daniele³, Cortina Raquel¹

Affiliation:

1. University of Oviedo

2. University of Jaen

3. University of Udine

Abstract

Multichannel source separation has been a popular topic, and recently proposed methods based on the local Gaussian model (LGM) have provided promising results despite their high computational cost when several sensors are used. The main reason for this cost is the inversion of a spatial covariance matrix, which has a complexity of \(O(I^3)\), where \(I\) is the number of sensors. This drawback limits the practical application of the approach to tasks such as sound field reconstruction or virtual reality, among others. In this paper, we present a numerical approach to reduce the complexity of Multichannel NMF (MNMF) for audio source separation in scenarios with a high number of sensors, such as High Order Ambisonics (HOA) encoding. In particular, we propose a parallel multi-architecture driver to compute the multiplicative update rules in MNMF approaches. The driver has been designed to work on both sequential and multi-core computers, as well as on Graphics Processing Units (GPUs) and Intel Xeon coprocessors. The software is written in C and can be called from numerical computing environments. The proposed solution reduces the computational cost of the multiplicative update rules by using the Cholesky decomposition and solving several triangular equation systems instead of inverting the matrix explicitly. The proposal has been evaluated in different scenarios, with promising results in terms of execution time on both CPU and GPU. To the best of our knowledge, our proposal is the first system that addresses the problem of reducing the computational cost of full-rank MNMF-based systems using parallel and high-performance techniques.

Publisher

Research Square Platform LLC
