Model Generation of Accented Speech using Model Transformation and Verification for Bilingual Speech Recognition

Published:2015-04-20 Issue:2 Volume:14 Page:1-24
ISSN:2375-4699
Container-title:ACM Transactions on Asian and Low-Resource Language Information Processing
language:en
Short-container-title:ACM Trans. Asian Low-Resour. Lang. Inf. Process.

Author:

Shen Han-ping¹, Wu Chung-hsien¹, Tsai Pei-shan¹

Affiliation:

1. National Cheng Kung University

Abstract

Nowadays, bilingual or multilingual speech recognition is confronted with the accent-related problem caused by non-native speech in a variety of real-world applications. Accent modeling of non-native speech is definitely challenging, because the acoustic properties in highly-accented speech pronounced by non-native speakers are quite divergent. The aim of this study is to generate highly Mandarin-accented English models for speakers whose mother tongue is Mandarin. First, a two-stage, state-based verification method is proposed to extract the state-level, highly-accented speech segments automatically. Acoustic features and articulatory features are successively used for robust verification of the extracted speech segments. Second, Gaussian components of the highly-accented speech models are generated from the corresponding Gaussian components of the native speech models using a linear transformation function. A decision tree is constructed to categorize the transformation functions and used for transformation function retrieval to deal with the data sparseness problem. Third, a discrimination function is further applied to verify the generated accented acoustic models. Finally, the successfully verified accented English models are integrated into the native bilingual phone model set for Mandarin-English bilingual speech recognition. Experimental results show that the proposed approach can effectively alleviate recognition performance degradation due to accents and can obtain absolute improvements of 4.1%, 1.8%, and 2.7% in word accuracy for bilingual speech recognition compared to that using traditional ASR approaches, MAP-adapted, and MLLR-adapted ASR methods, respectively.

Publisher

Association for Computing Machinery (ACM)

Subject

General Computer Science

Link

https://dl.acm.org/doi/pdf/10.1145/2661637

Reference: 55 articles.

1. A Tutorial on Text-Independent Speaker Verification

2. A novel characterization of the alternative hypothesis using kernel discriminant analysis for LLR-based speaker verification;Chao Y.-H.;Int. J. Comput. Linguist. Chinese Lang. Process.,2007

Cited by 3 articles.

1. LIFA: Language identification from audio with LPCC-G features;Multimedia Tools and Applications;2023-12-14

2. Bilingual Automatic Speech Recognition: A Review, Taxonomy and Open Challenges;IEEE Access;2023

3. Natural language processing applications in library and information science;Online Information Review;2019-08-12