Affiliation
1. University of Illinois at Urbana-Champaign
Abstract
Machine learning (ML) has revolutionized a wide range of recognition tasks, from text analysis to speech to vision, most notably in cloud deployments. However, mobile deployment of these ideas involves a very different category of design problems. In this article, we develop a hardware architecture for a sound source separation task, intended for deployment on a mobile phone. We focus on a novel Markov random field (MRF) sound source separation algorithm that uses expectation-maximization and Gibbs sampling to learn MRF parameters on the fly and infer the best separation of sources. The intrinsically iterative algorithm poses challenges for both speed and power. A real-time streaming FPGA implementation runs at 150 MHz with 207 KB of RAM, achieves a speed-up of 22× over a software reference, reaches a signal-to-distortion ratio (SDR) of up to 7.021 dB with 1.601 ms latency, and exhibits excellent perceived audio quality. A 45 nm CMOS ASIC virtual prototype simulated at 20 MHz shows that this architecture is small (<10 million gates) and consumes only 70 mW, which is less than 2% of the power of an ARM Cortex-A9 software version. To the best of our knowledge, this is the first Gibbs sampling inference accelerator designed in conventional FPGA/ASIC technology that targets a realistic mobile perceptual application.
Funder
MARCO
DARPA
six SRC STARnet Centers
Systems on Nanoscale Information fabriCs
National Science Foundation
Publisher
Association for Computing Machinery (ACM)
Subject
Electrical and Electronic Engineering, Hardware and Architecture, Software
Cited by
4 articles.
1. Efficient underdetermined speech signal separation using encompassed Hammersley-Clifford algorithm and hardware implementation;Microprocessors and Microsystems;2021-09
2. Accelerating Bayesian Inference on Structured Graphs Using Parallel Gibbs Sampling;2019 29th International Conference on Field Programmable Logic and Applications (FPL);2019-09
3. FlexGibbs: Reconfigurable Parallel Gibbs Sampling Accelerator for Structured Graphs;2019 IEEE 27th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM);2019-04
4. Demystifying Bayesian Inference Workloads;2019 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS);2019-03