Speech recognition for under-resourced languages: Data sharing in hidden Markov model systems

Published:2017-01-23
Issue:1/2
Volume:113
ISSN:1996-7489
Container-title:South African Journal of Science
language:en
Short-container-title:S. Afr. J. Sci

Author:

de Wet Febe, Kleynhans Neil, van Compernolle Dirk, Sahraeian Reza

Abstract

For purposes of automated speech recognition in under-resourced environments, techniques used to share acoustic data between closely related or similar languages become important. Donor languages with abundant resources can potentially be used to increase the recognition accuracy of speech systems developed in the resource-poor target language. The assumption is that adding more data will increase the robustness of the statistical estimations captured by the acoustic models. In this study we investigated data sharing between Afrikaans and Flemish – an under-resourced and well-resourced language, respectively. Our approach was focused on the exploration of model adaptation and refinement techniques associated with hidden Markov model-based speech recognition systems to improve the benefit of sharing data. Specifically, we focused on the use of currently available techniques, some possible combinations and the exact utilisation of the techniques during the acoustic model development process. Our findings show that simply using normal approaches to adaptation and refinement does not result in any benefits when adding Flemish data to the Afrikaans training pool. The only observed improvement was achieved when developing acoustic models on all available data but estimating model refinements and adaptations on the target data only.

Publisher

Academy of Science of South Africa

Subject

General Earth and Planetary Sciences; General Agricultural and Biological Sciences; General Biochemistry, Genetics and Molecular Biology

Link

http://sajs.co.za/sites/default/files/publications/pdf/SAJS-113-1-2_de-Wet_ResearchArticle.pdf

Cited by 7 articles.

1. Deploying a Speech Recognition Model for Under-Resourced Languages: A Case Study on Dioula Wake Words 1, 2, 3, and 4;Proceedings of the 2023 7th International Conference on Natural Language Processing and Information Retrieval;2023-12-15

2. A Review on Speech Recognition for Under-Resourced Languages;International Journal of Knowledge and Systems Science;2023-10-27

3. Analytical Review of Methods for Solving Data Scarcity Issues Regarding Elaboration of Automatic Speech Recognition Systems for Low-Resource Languages;Informatics and Automation;2022-07-08

4. Optimization of Intelligent English Pronunciation Training System Based on Android Platform;Complexity;2021-03-26

5. Determining the adaptation data saturation of ASR systems for dysarthric speakers;International Journal of Speech Technology;2021-01-02