Performance Analysis of Deep Learning Model-Compression Techniques for Audio Classification on Edge Devices

Published: 2024-04-02 Issue: 2 Volume: 6 Page: 21
ISSN: 2413-4155
Container-title: Sci
Language: en
Short-container-title: Sci

Author:

Mou Afsana¹, Milanova Mariofanna¹

Affiliation:

1. Department of Computer Science, University of Arkansas, Little Rock, AR 72204, USA

Abstract

Audio classification with deep learning models is essential for applications such as voice assistants and music analysis, but deployment on edge devices is difficult because of their limited computational resources and memory. Balancing performance, efficiency, and accuracy is a significant obstacle to optimizing these models for such constrained environments. In this investigation, we evaluate several deep learning architectures, including Convolutional Neural Networks (CNN) and Long Short-Term Memory (LSTM) networks, for audio classification on the ESC-50, UrbanSound8K, and AudioSet datasets. Our empirical findings indicate that Mel spectrograms outperform raw audio data as model input; we attribute this improvement to their compatibility with advanced image classification algorithms and their correspondence with human auditory perception. To address constraints on model size, we apply model-compression techniques, notably magnitude pruning, Taylor pruning, and 8-bit quantization. A hybrid pruned model achieves 89 percent accuracy, which is marginally lower than the 92 percent accuracy of the uncompressed CNN and illustrates a balance between efficiency and performance. We then deploy the optimized model on the Raspberry Pi 4 and NVIDIA Jetson Nano platforms for audio classification tasks. These findings highlight the significant potential of model-compression strategies for enabling effective deep learning applications on resource-limited devices with minimal compromise on accuracy.
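
As a rough illustration of two of the compression techniques named in the abstract, the sketch below applies magnitude pruning followed by 8-bit quantization to a small list of weights. It is not the authors' implementation: the function names, the example weights, the 50 percent sparsity level, and the choice of symmetric per-tensor quantization are all assumptions made for illustration, and Taylor pruning is not shown.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with the smallest magnitude."""
    k = int(len(weights) * sparsity)  # number of weights to remove
    if k == 0:
        return list(weights)
    # Indices ordered by absolute value, smallest first
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:k])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]


def quantize_int8(weights):
    """Symmetric per-tensor quantization to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Map 8-bit integers back to approximate floating-point weights."""
    return [v * scale for v in q]


# Hypothetical weights, chosen only for this example
weights = [0.80, -0.05, 0.30, -0.60, 0.02, 0.10, -0.90, 0.04]

pruned = magnitude_prune(weights, sparsity=0.5)  # drop the 4 smallest weights
q, scale = quantize_int8(pruned)                 # store as 8-bit integers
restored = dequantize(q, scale)                  # values used at inference time
```

Here half of the weights become exact zeros, and each surviving weight is stored as one signed byte plus a shared scale factor, with a rounding error of at most half the scale per weight. In a real network the same idea would typically be applied per layer, followed by fine-tuning to recover accuracy.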

Funder

NSF I-Corps 21552- National Innovation and the National Science Foundation

Publisher

MDPI AG

Link

https://www.mdpi.com/2413-4155/6/2/21/pdf

References: 36 articles.

1. der Mauer, M.A., Behrens, T., Derakhshanmanesh, M., Hansen, C., and Muderack, S. (2019). Digitalization Cases: How Organizations Rethink Their Business for the Digital Age, Springer.

2. Development of internal sound sensor using stethoscope and its applications for machine monitoring; Yun; Procedia Manuf., 2020

3. An overview of applications and advancements in automatic sound recognition; Sharan; Neurocomputing, 2016

4. A multi-view CNN-based acoustic classification system for automatic animal species identification; Xu; Ad Hoc Netw., 2020

5. Automatic acoustic identification of individuals in multiple species: Improving identification across recording conditions; Stowell; J. R. Soc. Interface, 2019

Cited by 1 article.

1. RATTLE: Train Identification Through Audio Fingerprinting; 2024 IEEE International Conference on Smart Computing (SMARTCOMP); 2024-06-29