Abstract
The performance of voice-controlled systems often degrades on accented speech. To make these systems more robust, front-end accent recognition (AR) technologies have received increased attention in recent years. Because accent is a high-level abstract feature closely tied to language knowledge, AR is more challenging than language-agnostic audio classification tasks. In this paper, we use an auxiliary automatic speech recognition (ASR) task to extract language-related phonetic features. Furthermore, we propose a hybrid structure that combines the embeddings of a fixed acoustic model and a trainable acoustic model, making the language-related acoustic features more robust. We conduct several experiments on the AESRC dataset. The results show that our approach obtains an 8.02% relative improvement over the Transformer baseline, demonstrating the merits of the proposed method.
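The hybrid structure and the auxiliary ASR task described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the authors' implementation: the abstract does not say how the two embeddings are combined (concatenation is assumed), how the AR and ASR losses are weighted (a single hypothetical `asr_weight` is assumed), or which metric the 8.02% figure refers to (a higher-is-better score is assumed). All function names are hypothetical.

```python
from typing import List, Sequence

Frame = List[float]


def fuse_embeddings(fixed: Sequence[Frame], trainable: Sequence[Frame]) -> List[Frame]:
    """Concatenate per-frame embeddings from a fixed and a trainable acoustic model.

    Concatenation is an assumed fusion rule; the abstract does not specify one.
    """
    if len(fixed) != len(trainable):
        raise ValueError("both models must emit the same number of frames")
    return [list(f) + list(t) for f, t in zip(fixed, trainable)]


def mean_pool(frames: Sequence[Frame]) -> Frame:
    """Average over time to obtain one utterance-level vector for the accent classifier."""
    if not frames:
        raise ValueError("at least one frame is required")
    n = len(frames)
    return [sum(column) / n for column in zip(*frames)]


def multitask_loss(ar_loss: float, asr_loss: float, asr_weight: float = 0.3) -> float:
    """Weighted sum of the main AR loss and the auxiliary ASR loss.

    asr_weight is a hypothetical hyperparameter, not a value from the paper.
    """
    if not 0.0 <= asr_weight <= 1.0:
        raise ValueError("asr_weight must lie in [0, 1]")
    return (1.0 - asr_weight) * ar_loss + asr_weight * asr_loss


def relative_improvement(baseline: float, proposed: float) -> float:
    """Relative improvement in percent, assuming a higher-is-better metric."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return 100.0 * (proposed - baseline) / baseline


if __name__ == "__main__":
    # Toy inputs: two frames, a 2-dim fixed embedding and a 1-dim trainable one.
    fixed = [[1.0, 2.0], [3.0, 4.0]]
    trainable = [[0.5], [1.5]]
    utterance = mean_pool(fuse_embeddings(fixed, trainable))
    print(utterance)
    print(multitask_loss(ar_loss=1.0, asr_loss=2.0, asr_weight=0.5))
    # Made-up scores, used only to show what "relative" means here.
    print(relative_improvement(baseline=50.0, proposed=55.0))
```

In this sketch the fixed model's embeddings would come from a frozen, pre-trained ASR encoder and the trainable model's from an encoder updated jointly with the accent classifier; both are represented only by their output frames.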
Subject
Electrical and Electronic Engineering; Biochemistry; Instrumentation; Atomic and Molecular Physics, and Optics; Analytical Chemistry
Cited by
10 articles.
1. Automatic Accent Identification Using Less Data: a Shift from Global to Segmental Accent;Arabian Journal for Science and Engineering;2024-08-13
2. Transfer learning methods for low-resource speech accent recognition: A case study on Vietnamese language;Engineering Applications of Artificial Intelligence;2024-06
3. Decoupling and Interacting Multi-Task Learning Network for Joint Speech and Accent Recognition;IEEE/ACM Transactions on Audio, Speech, and Language Processing;2024
4. Probing Speech Quality Information in ASR Systems;INTERSPEECH 2023;2023-08-20
5. Improving Vietnamese Accent Recognition Using ASR Transfer Learning;2022 25th Conference of the Oriental COCOSDA International Committee for the Co-ordination and Standardisation of Speech Databases and Assessment Techniques (O-COCOSDA);2022-11