Efficient Melody Extraction Based on Extreme Learning Machine

Published:2020-03-25 Issue:7 Volume:10 Page:2213
ISSN:2076-3417
Container-title:Applied Sciences
language:en
Short-container-title:Applied Sciences

Author:

Zhang Weiwei, Zhang Qiaoling, Bi Sheng, Fang Shaojun, Dai Jinliang

Abstract

Melody extraction is an important task in the music information retrieval community, and it remains unresolved because of the complex nature of real-world recordings. In this paper, the melody extraction problem is addressed in the extreme learning machine (ELM) framework. More specifically, the input musical signal is first pre-processed to mimic the human auditory system. The music features are then constructed by the constant-Q transform (CQT), and the concentration strategy is introduced to make use of contextual information. Afterwards, the rough melody pitches are determined by the ELM network according to its pre-trained parameters. Finally, the rough melody pitches are fine-tuned using the spectral peaks around the frame-wise rough pitches. The proposed method extracts melody from polyphonic music efficiently and effectively, with pitch estimation and voicing detection conducted jointly. Experiments were conducted on three publicly available datasets. The experimental results reveal that the proposed method achieves higher overall accuracies at very fast speed.
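
The ELM step named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a single hidden layer with random, untrained input weights and ridge-regularised closed-form output weights (the general ELM recipe), and it assumes that joint pitch and voicing estimation is cast as classification with class 0 meaning "unvoiced". The helper `add_context`, the toy synthetic frames, and all parameter values are hypothetical; pre-processing, the CQT, and the final peak-based fine-tuning are left out.

```python
import numpy as np


def add_context(frames, k=1):
    """Stack each frame with k neighbours on each side (edges padded by repetition)."""
    padded = np.pad(frames, ((k, k), (0, 0)), mode="edge")
    return np.hstack([padded[i:i + len(frames)] for i in range(2 * k + 1)])


def train_elm(X, T, n_hidden=64, reg=1e-3, seed=0):
    """Fit an ELM: random hidden layer, output weights solved in closed form."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], n_hidden))  # random, never trained
    b = rng.standard_normal(n_hidden)
    H = np.tanh(X @ W + b)                           # hidden activations
    # Ridge least squares: beta = (H^T H + reg * I)^-1 H^T T
    beta = np.linalg.solve(H.T @ H + reg * np.eye(n_hidden), H.T @ T)
    return W, b, beta


def predict_elm(X, W, b, beta):
    """Return the highest-scoring class per frame (0 = unvoiced, assumed)."""
    return np.argmax(np.tanh(X @ W + b) @ beta, axis=1)


# Toy stand-in for spectral features: one strong bin per voiced frame.
rng = np.random.default_rng(1)
n_frames, n_bins = 400, 12
labels = rng.integers(0, n_bins + 1, size=n_frames)  # 0 = unvoiced
frames = 0.1 * rng.random((n_frames, n_bins))
voiced = labels > 0
frames[voiced, labels[voiced] - 1] += 1.0

X = add_context(frames, k=1)                         # contextual features
T = np.eye(n_bins + 1)[labels]                       # one-hot targets
W, b, beta = train_elm(X, T)
pred = predict_elm(X, W, b, beta)
accuracy = float(np.mean(pred == labels))            # training accuracy on toy data
```

Because only the output weights are solved for, and in a single linear system, training needs no iterative optimisation, which is the usual reason ELM-style models are fast to fit.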

Funder

National Natural Science Foundation of China

Natural Science Foundation of Liaoning Province

Publisher

MDPI AG

Subject

Fluid Flow and Transfer Processes, Computer Science Applications, Process Chemistry and Technology, General Engineering, Instrumentation, General Materials Science

Link

https://www.mdpi.com/2076-3417/10/7/2213/pdf

References: 29 articles.

1. Melody Extraction from Polyphonic Music Signals: Approaches, applications, and challenges

2. Tonal representations for music retrieval: from version identification to query-by-humming

3. Melody Extraction Using Chroma-Level Note Tracking and Pitch Mapping

4. A real-time music-scene-description system: predominant-F0 estimation for detecting melody and bass lines in real-world audio signals