Author:
Shingchern D. You, Kai-Rong Lin, Chien-Hung Liu
Abstract
This paper proposes an approach called block scaling quality (BSQ) for estimating the prediction accuracy of a deep network model. The basic operation perturbs the input spectrogram by multiplying all values within a block by a scaling factor, which is set to 0 in the experiments. The ratio of perturbed spectrograms whose predicted labels differ from that of the original spectrogram to the total number of perturbed spectrograms indicates how much of the spectrogram is crucial for the prediction. Thus, this ratio is inversely correlated with the model's accuracy on the dataset. In experiments, the BSQ approach demonstrates satisfactory estimation accuracy compared with various other approaches. When using only the Jamendo and FMA datasets, the accuracy estimates have average errors of 4.9% and 1.8%, respectively. Moreover, the BSQ approach holds advantages over some of the compared approaches. Overall, it is a promising approach for estimating the accuracy of a deep network model.
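The block-scaling operation and the label-change ratio described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the function names (`scale_block`, `block_flip_ratio`, `toy_predict`), the use of a non-overlapping grid of blocks, the block size, and the toy threshold classifier that stands in for a deep network. The step that maps the ratio to an accuracy estimate is not shown, since the abstract does not describe it.

```python
from typing import Callable, List

Spectrogram = List[List[float]]  # rows = frequency bins, columns = time frames


def scale_block(spec: Spectrogram, row: int, col: int,
                height: int, width: int, factor: float = 0.0) -> Spectrogram:
    """Return a copy of spec with one block multiplied by factor."""
    out = [list(r) for r in spec]  # copy so the original is left unchanged
    for r in range(row, min(row + height, len(out))):
        for c in range(col, min(col + width, len(out[r]))):
            out[r][c] *= factor
    return out


def block_flip_ratio(spec: Spectrogram,
                     predict: Callable[[Spectrogram], int],
                     height: int, width: int, factor: float = 0.0) -> float:
    """Fraction of block-perturbed copies whose label differs from the original.

    Assumption: blocks are taken on a non-overlapping grid, one perturbed
    copy per block.
    """
    original_label = predict(spec)
    flipped = 0
    total = 0
    for row in range(0, len(spec), height):
        for col in range(0, len(spec[0]), width):
            perturbed = scale_block(spec, row, col, height, width, factor)
            if predict(perturbed) != original_label:
                flipped += 1
            total += 1
    return flipped / total


def toy_predict(spec: Spectrogram) -> int:
    """Hypothetical stand-in for a trained model: label 1 if total energy > 2."""
    return 1 if sum(sum(r) for r in spec) > 2.0 else 0


# A 4x4 spectrogram whose energy sits entirely in the top-left 2x2 block.
spec = [
    [1.0, 1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
ratio = block_flip_ratio(spec, toy_predict, height=2, width=2)
print(ratio)  # 0.25: only zeroing the top-left block changes the label
```

In this toy case, one of the four blocks is crucial for the prediction, so the ratio is 0.25. A higher ratio would mean that the prediction depends on a larger share of the spectrogram, which the abstract associates with lower accuracy.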
Publisher
Taiwan Association of Engineering and Technology Innovation
Subject
Electrical and Electronic Engineering, Mechanical Engineering, Mechanics of Materials, Civil and Structural Engineering