Abstract
Speech recognition has been a widely researched topic for decades, and some successful products, such as Siri, have already been put into commercial use. However, it is sometimes hard to recognize a word because speakers with different accents pronounce it differently. For instance, in Japanese-accented English, /r/ is usually pronounced as /l/. It is therefore natural to divide speech recognition into two stages: first attach an accent label to the audio, then recognize the content based on the regular patterns of that accent. In this paper, we study several characteristics, including the voice onset region (VOR), vowels, and formants, to distinguish British English from American English. By applying both a linear neural network and a neural network with nonlinear activations and two hidden layers (NN2HL), the accuracy reaches 86.67%, which is very satisfactory.
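The NN2HL classifier described above can be sketched as a simple forward pass over per-utterance acoustic features. The feature vector (normalised formants plus a VOR measurement), the layer sizes, the ReLU/sigmoid choices, and the label convention below are illustrative assumptions, not the paper's actual configuration:

```python
import math
import random

def relu(x):
    # Elementwise rectified linear unit
    return [max(0.0, v) for v in x]

def dense(x, W, b):
    # Affine layer: y = W x + b, with W stored as a list of rows
    return [sum(w * xi for w, xi in zip(row, x)) + bj
            for row, bj in zip(W, b)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def nn2hl_forward(features, params):
    """Forward pass of a two-hidden-layer binary accent classifier.

    `features` might be normalised [F1, F2, F3, VOR] for one vowel
    token; the architecture here is an assumed sketch of an NN2HL-style
    model, not the paper's exact network.
    """
    h1 = relu(dense(features, params["W1"], params["b1"]))
    h2 = relu(dense(h1, params["W2"], params["b2"]))
    logit = dense(h2, params["W3"], params["b3"])[0]
    return sigmoid(logit)  # e.g. P(accent == American English)

# Toy randomly initialised parameters: 4 inputs -> 8 -> 8 -> 1 output.
random.seed(0)
def rand_matrix(rows, cols):
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)]
            for _ in range(rows)]

params = {
    "W1": rand_matrix(8, 4), "b1": [0.0] * 8,
    "W2": rand_matrix(8, 8), "b2": [0.0] * 8,
    "W3": rand_matrix(1, 8), "b3": [0.0],
}

# One hypothetical normalised feature vector [F1, F2, F3, VOR]
p = nn2hl_forward([0.3, 0.7, 0.5, 0.2], params)
```

In practice the weights would be trained (e.g. by gradient descent on labelled British/American utterances) rather than drawn at random; the sketch only shows the shape of the model.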
Subject
General Physics and Astronomy
Cited by 1 article.