Abstract
In this paper, we present the XMUSPEECH systems for Track 2 of the Interspeech 2020 Accented English Speech Recognition Challenge (AESRC2020). Track 2 is an automatic speech recognition (ASR) task in which non-native English speakers have a variety of accents, which degrades the accuracy of the ASR system. To address this problem, we experimented with different acoustic models and input features. Furthermore, we trained a TDNN-LSTM language model for lattice rescoring to obtain better results. Compared with our baseline system, we achieved relative word error rate (WER) improvements of 40.7% and 35.7% on the development and evaluation sets, respectively.
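For readers unfamiliar with the metric, a relative WER improvement is the drop in WER expressed as a fraction of the baseline WER. The sketch below illustrates the arithmetic only; the function name `relative_wer_improvement` and the WER values are hypothetical and are not taken from the paper.

```python
def relative_wer_improvement(baseline_wer: float, system_wer: float) -> float:
    """Return the relative WER improvement over a baseline, in percent."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return 100.0 * (baseline_wer - system_wer) / baseline_wer


# Hypothetical WERs in percent, chosen for illustration (not the paper's results).
example = relative_wer_improvement(baseline_wer=10.0, system_wer=6.0)
print(f"{example:.1f}% relative improvement")
```

Note that a relative figure says nothing about the absolute WER: the same percentage can correspond to very different absolute gains depending on the baseline.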
Funder
National Natural Science Foundation of China
Subject
Fluid Flow and Transfer Processes, Computer Science Applications, Process Chemistry and Technology, General Engineering, Instrumentation, General Materials Science
Cited by
3 articles.
1. English Speech Recognition Model Based on Improved Neural Network; 2023 International Conference on Network, Multimedia and Information Technology (NMITCON); 2023-09-01
2. Early Fusion of Phone Embeddings for Recognition of Low-Resourced Accented Speech; 2022 4th International Conference on Artificial Intelligence and Speech Technology (AIST); 2022-12-09
3. Pseudo-Phoneme Label Loss for Text-Independent Speaker Verification; Applied Sciences; 2022-07-25