Authors:
Meng Shiting, Hao Qingbo, Xiao Yingyuan, Zheng Wenguang
Abstract
Convolutional neural networks (CNNs) have been applied successfully to music genre classification. As music has diversified, genre fusion has become common, and fused music exhibits multiple similar musical features, such as rhythm, timbre, and structure, that typically arise from the temporal information in the spectrum. Traditional CNNs, however, cannot effectively capture this temporal information, making fused music difficult to distinguish. To address this issue, this study proposes MusicNeXt, a CNN model for music genre classification. Its goal is to improve the feature extraction method so that the model attends more closely to musical features and increases the distinctiveness between genres, thereby reducing bias in the classification results. Specifically, we construct a feature extraction module that fully exploits temporal information, strengthening the model's focus on musical features and its handling of the complexity of fused music. In addition, we introduce a genre-sensitive adjustment layer that applies within-class angle constraints to strengthen the learning of differences between genres, increasing inter-genre distinctiveness and lending interpretability to the classification results. Experimental results demonstrate that MusicNeXt outperforms baseline networks and other state-of-the-art methods on music genre classification tasks without introducing category bias into the classification results.
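The abstract does not specify the exact form of the within-class angle constraint in the genre-sensitive adjustment layer. A common realization of such a constraint is an ArcFace-style additive angular margin on L2-normalized embeddings, and the sketch below illustrates that idea in PyTorch under this assumption; the class name AngularMarginHead and the scale/margin values are illustrative, not the paper's.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AngularMarginHead(nn.Module):
    """Hypothetical within-class angle constraint (ArcFace-style sketch).

    Logits are cosines between L2-normalized embeddings and per-genre
    weight vectors; an additive angular margin on the true-genre angle
    pulls same-genre embeddings together, widening the angular gap
    between genres.
    """

    def __init__(self, embed_dim: int, num_genres: int,
                 scale: float = 30.0, margin: float = 0.2):
        super().__init__()
        self.weight = nn.Parameter(torch.empty(num_genres, embed_dim))
        nn.init.xavier_uniform_(self.weight)
        self.scale = scale
        self.margin = margin

    def forward(self, embeddings: torch.Tensor,
                labels: torch.Tensor) -> torch.Tensor:
        # Cosine of the angle between each embedding and each genre center.
        cos = F.linear(F.normalize(embeddings), F.normalize(self.weight))
        # Tighten the true genre's cluster by adding the margin to its angle.
        theta = torch.acos(cos.clamp(-1.0 + 1e-7, 1.0 - 1e-7))
        cos_margin = torch.cos(theta + self.margin)
        one_hot = F.one_hot(labels, num_classes=self.weight.size(0)).float()
        return self.scale * (one_hot * cos_margin + (1.0 - one_hot) * cos)

# Toy usage: 256-d embeddings from a CNN backbone, 10 genres.
head = AngularMarginHead(embed_dim=256, num_genres=10)
feats = torch.randn(8, 256)
labels = torch.randint(0, 10, (8,))
loss = F.cross_entropy(head(feats, labels), labels)
```

Because the margin acts only on the angle to the true genre's weight vector, training shrinks within-class angles while leaving between-class angles intact, which is one plausible way such a layer could make inter-genre differences more distinct and interpretable.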
Subjects
Artificial Intelligence, Computer Vision and Pattern Recognition, Theoretical Computer Science