Affiliation:
1. Anshan Normal University, Anshan, Liaoning, China
Abstract
Song recognition is the task of automatically identifying the name of a song from an input audio clip. Because it offers a convenient and user‐friendly form of interaction, song recognition has become an active topic in music retrieval research. However, most existing song recognition methods assume that the collected audio is clean. In practical applications, the acquisition equipment is often inexpensive and the collected audio is heavily contaminated by noise, which results in poor recognition accuracy. To address these problems, and targeting data engineering and low‐cost microphone scenarios, this paper proposes a deep learning based two‐stage song recognition framework. In the first stage, a Denoising Auto‐Encoder network performs speech enhancement to obtain clean audio data. In the second stage, the proposed Con‐LSTM network recognizes the song from the enhanced audio. Con‐LSTM combines the advantages of a convolutional neural network (CNN) and a recurrent neural network (RNN), and therefore has stronger recognition ability. Experimental results show that the proposed framework can effectively identify songs collected by low‐cost microphones. As such, the framework can be embedded in Web of Things (WoT) systems to help improve speech recognition tasks, which are essential in many advanced WoT systems.
Subject
Artificial Intelligence, Computer Networks and Communications, Information Systems, Software