Combining Acoustic and Multilevel Visual Features for Music Genre Classification

Published: 2015-08-24 Issue: 1 Volume: 12 Page: 1-17
ISSN: 1551-6857
Container-title: ACM Transactions on Multimedia Computing, Communications, and Applications
Language: en
Short-container-title: ACM Trans. Multimedia Comput. Commun. Appl.

Author:

Wu Ming-Ju¹, Jang Jyh-Shing R.²

Affiliation:

1. National Taiwan University, Hsinchu, Taiwan

2. National Taiwan University, Taipei, Taiwan

Abstract

Most music genre classification approaches extract acoustic features from frames to capture timbre information, leading to the common framework of bag-of-frames analysis. However, time-frequency analysis is also vital for modeling music genres. This article proposes multilevel visual features for extracting spectrogram textures and their temporal variations. A confidence-based late fusion is proposed for combining the acoustic and visual features. The experimental results indicated that the proposed method achieved an accuracy improvement of approximately 14% and 2% in the world's largest benchmark dataset (MASD) and Unique dataset, respectively. In particular, the proposed approach won the Music Information Retrieval Evaluation eXchange (MIREX) music genre classification contests from 2011 to 2013, demonstrating the feasibility and necessity of combining acoustic and visual features for classifying music genres.

Publisher

Association for Computing Machinery (ACM)

Subject

Computer Networks and Communications, Hardware and Architecture

Link

https://dl.acm.org/doi/pdf/10.1145/2801127

References: 44 articles.

1. Time-Frequency Analysis of Musical Instruments

2. Support vector machines using GMM supervectors for speaker verification

3. Chuan Cao and Ming Li. 2009. Thinkits submission for MIREX 2009 audio music classification and similarity tasks. http://www.music-ir.org/mirex/results/2009/abs/CL.pdf.

Cited by 21 articles.

1. Classification and study of music genres with multimodal Spectro-Lyrical Embeddings for Music (SLEM);Multimedia Tools and Applications;2024-04-24

2. Time-frequency visual representation and texture features for audio applications: a comprehensive review, recent trends, and challenges;Multimedia Tools and Applications;2023-03-16

3. PMG-Net: Persian music genre classification using deep neural networks;Entertainment Computing;2023-01

4. Stacked auto-encoders based visual features for speech/music classification;Expert Systems with Applications;2022-12

5. The Classification of Music and Art Genres under the Visual Threshold of Deep Learning;Computational Intelligence and Neuroscience;2022-05-18