Abstract
Singing Voice Detection (SVD) is a classification task that determines whether a given audio segment contains a singing voice. While current systems produce high-quality results on this task, the reported experiments are usually limited to popular music. A Long-Term Recurrent Convolutional Network (LRCN) was adapted to detect vocals in a new dataset of electronic music, both to evaluate its performance in a different genre and to compare its results against those of other state-of-the-art experiments on pop music. Experiments on two datasets studied the impact of different audio features and block sizes on the LRCN's learning of temporal relationships, as well as the benefits of preprocessing on performance. The results provide a benchmark for evaluating electronic music and its intricacies.
Subject
Fluid Flow and Transfer Processes, Computer Science Applications, Process Chemistry and Technology, General Engineering, Instrumentation, General Materials Science
Cited by
2 articles.