Abstract
AbstractIn order to perform emotion prediction in music quickly and accurately, we have proposed a muscular and tall neural network architecture for music emotion classification. Specifically, during the audio pre-processing stage, we converge mel-scale frequency cepstral coefficients features and residual phase features with weighting, enabling the extraction of more comprehensive music emotion characteristics. Additionally, to enhance the accuracy of predicting musical emotion while reducing computational complexity during training phase, we consolidate Long short term memory network with Broad learning system network. We employ long short term memory structure as the feature mapping node of broad learning system structure, leveraging the advantages of both network models. This novel Neural Network architecture, called BLNN (Broad-Long Neural Network), achieves higher prediction accuracy. i.e., 66.78%, than single network models and other benchmark with/without consolidation methods. Moreover, it achieves lower time complexity than other excellent models, i.e., 169.32 s of training time and 507.69 ms of inference time, and achieves the optimal balance between efficiency and performance. In short, the extensive experimental results demonstrate that the proposed BLNN architecture effectively predicts music emotion, surpassing other models in terms of accuracy while reducing computational demands. In addition, the detailed description of the related work, along with an analysis of its advantages and disadvantages, and its future prospects, can serve as a valuable reference for future researchers.
Funder
key subject of Shandong Province's art and science
Shandong Province Arts Science Key Project
Publisher
Springer Science and Business Media LLC