Author:
Zhang Weiwei,Zhang Qiaoling,Bi Sheng,Fang Shaojun,Dai Jinliang
Abstract
Melody extraction is an important task in music information retrieval community and it is unresolved due to the complex nature of real-world recordings. In this paper, the melody extraction problem is addressed in the extreme learning machine (ELM) framework. More specifically, the input musical signal is first pre-processed to mimic the human auditory system. The music features are then constructed by constant-Q transform (CQT), and the concentration strategy is introduced to make use of contextual information. Afterwards, the rough melody pitches are determined by ELM network, according to its pre-trained parameters. Finally, the rough melody pitches are fine-tuned by the spectral peaks around the frame-wise rough pitches. The proposed method can extract melody from polyphonic music efficiently and effectively, where pitch estimation and voicing detection are conducted jointly. Some experiments have been conducted based on three publicly available datasets. The experimental results reveal that the proposed method achieves higher overall accuracies with very fast speed.
Funder
National Natural Science Foundation of China
Natural Science Foundation of Liaoning Province
Subject
Fluid Flow and Transfer Processes,Computer Science Applications,Process Chemistry and Technology,General Engineering,Instrumentation,General Materials Science