Affiliation:
1. Anhui Art College, Hefei, Anhui Province 230011, China
Abstract
Audio scene recognition is a task that enables devices to understand their environment by analyzing digital audio. It is a branch of computational auditory scene analysis. The technology is already widely used in intelligent wearable devices, robot sensing services, and other application scenarios. To explore the applicability of machine learning to digital audio scene recognition, an audio scene recognition method based on optimized audio processing and a convolutional neural network is proposed. First, unlike traditional audio feature extraction based on mel-frequency cepstral coefficients (MFCCs), the proposed method uses a binaural representation and harmonic-percussive source separation (HPSS) to preprocess the original audio and extract the corresponding features, so that the system can exploit the spatial characteristics of the scene and thereby improve recognition accuracy. Then, an audio scene recognition system with a two-layer convolution module is designed and implemented. The network structure borrows from the VGGNet architecture used in image recognition to increase network depth and improve the flexibility of the system. Analysis of the experimental data shows that, compared with traditional machine learning methods, the proposed method greatly improves the recognition accuracy for each scene and generalizes better across different data.
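The preprocessing described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how the binaural representation or the separation is computed, so the left/right/mid/side channel layout, the median-filtering form of HPSS, the soft masks, and every function name and parameter value below (`n_fft`, `hop`, `size`) are assumptions chosen for illustration.

```python
import numpy as np


def binaural_channels(stereo):
    """Split a (2, n_samples) stereo array into four views (assumed layout)."""
    left, right = stereo[0], stereo[1]
    return {
        "left": left,
        "right": right,
        "mid": 0.5 * (left + right),   # content common to both ears
        "side": 0.5 * (left - right),  # content that differs between ears
    }


def magnitude_spectrogram(signal, n_fft=512, hop=256):
    """Hann-windowed STFT magnitude, shape (n_fft // 2 + 1, n_frames)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack(
        [signal[i * hop:i * hop + n_fft] * window for i in range(n_frames)],
        axis=1,
    )
    return np.abs(np.fft.rfft(frames, axis=0))


def median_filter_along(spec, size, axis):
    """Median-filter a 2-D array along one axis using reflect padding."""
    pad = size // 2
    pad_width = [(0, 0), (0, 0)]
    pad_width[axis] = (pad, pad)
    padded = np.pad(spec, pad_width, mode="reflect")
    windows = np.lib.stride_tricks.sliding_window_view(padded, size, axis=axis)
    return np.median(windows, axis=-1)


def hpss(spec, size=17):
    """Median-filtering HPSS with soft masks (an assumed variant)."""
    # Harmonic sounds are stable over time: smooth along the time axis.
    harmonic_est = median_filter_along(spec, size, axis=1)
    # Percussive sounds are broadband: smooth along the frequency axis.
    percussive_est = median_filter_along(spec, size, axis=0)
    total = harmonic_est ** 2 + percussive_est ** 2 + 1e-10
    harmonic = spec * harmonic_est ** 2 / total
    percussive = spec * percussive_est ** 2 / total
    return harmonic, percussive


def extract_features(stereo):
    """Return harmonic and percussive spectrograms for every channel view."""
    features = {}
    for name, channel in binaural_channels(stereo).items():
        harmonic, percussive = hpss(magnitude_spectrogram(channel))
        features[name + "_harmonic"] = harmonic
        features[name + "_percussive"] = percussive
    return features
```

Under these assumptions, each stereo clip yields eight spectrograms (four channel views, each split into harmonic and percussive parts), which could then be stacked as input channels for a convolutional network. A real system would more likely compute log-mel energies from these spectrograms before classification; that step is omitted here.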
Subject
Computer Science Applications, Software
Cited by
5 articles.