Affiliation:
1. K. J. Somaiya Institute of Management Studies and Research, India
2. Annamalai University, India
Abstract
The proposed work investigates the impact of Mel Frequency Cepstral Coefficients (MFCC), Chroma DCT-Reduced Log Pitch (CRP), and Chroma Energy Normalized Statistics (CENS) features on instrument recognition from monophonic instrumental music clips using three deep learning techniques: Bidirectional Recurrent Neural Networks with Long Short-Term Memory (BRNN-LSTM), Stacked Autoencoders (SAE), and Convolutional Neural Network - Long Short-Term Memory (CNN-LSTM). First, MFCC, CENS, and CRP features are extracted from instrumental music clips collected from various online libraries to form a dataset. The deep neural network models are then built by training on the extracted features. Recognition rates of 94.9%, 96.8%, and 88.6% are achieved with combined MFCC and CENS features, and 90.9%, 92.2%, and 87.5% with combined MFCC and CRP features, using the BRNN-LSTM, CNN-LSTM, and SAE models, respectively. The experimental results show that combining MFCC features with CENS and CRP features at the score level improves the performance of the proposed system.
Cited by: 3 articles.