Authors:
Zellou Georgia, Lahrouchi Mohamed
Abstract
Tashlhiyt is a low-resource language with respect to acoustic databases, language corpora, and speech technology tools such as Automatic Speech Recognition (ASR) systems. This study investigates whether cross-language re-use of ASR is viable for Tashlhiyt, using an existing commercially available system built for Arabic. The source and target languages in this case have similar phonological inventories, but Tashlhiyt permits typologically rare phonological patterns, including vowelless words, while Arabic does not. We find systematic disparities in ASR transfer performance (measured as word error rate (WER) and Levenshtein distance) for Tashlhiyt across word forms and speaking styles. Performance was worse for casual speaking modes across the board. In clear speech, performance was lower for vowelless than for voweled words. These results highlight systematic speaking-mode and phonotactic disparities in cross-language ASR transfer. They also indicate that linguistically informed approaches to ASR re-use can provide more effective ways to adapt existing speech technology tools for low-resource languages, especially when they contain typologically rare structures. The study also speaks to issues of linguistic disparities in ASR and speech technology more broadly, and can contribute to understanding the extent to which machines are similar to, or different from, humans in mapping the acoustic signal to discrete linguistic representations.
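The abstract names word error rate (WER) and Levenshtein distance as its performance measures but does not say how they were computed. As a rough illustration only, the sketch below uses the textbook definitions: Levenshtein distance as the minimum number of insertions, deletions, and substitutions between two sequences, and WER as the word-level edit distance divided by the number of reference words. The function names and the whitespace tokenization are assumptions made for this example, not details taken from the study.

```python
from typing import Sequence


def levenshtein(a: Sequence, b: Sequence) -> int:
    """Minimum number of insertions, deletions, and substitutions turning a into b."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(
                prev[j] + 1,         # delete x
                curr[j - 1] + 1,     # insert y
                prev[j - 1] + cost,  # substitute x with y, or match
            ))
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance over reference length.

    Assumption: words are separated by whitespace. A real evaluation would
    likely normalize orthography first.
    """
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return levenshtein(ref_words, hyp_words) / len(ref_words)
```

Calling `levenshtein` on two strings gives a character-level distance, which suits single-word comparisons between an intended form and an ASR output. Calling `wer` on two transcripts gives an utterance-level score that can exceed 1.0 when the hypothesis contains many inserted words.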
Publisher
Springer Science and Business Media LLC
Cited by
2 articles.
1. Assessing Speech Intelligibility and Severity Level in Parkinson's Disease Using Wav2Vec 2.0;2024 47th International Conference on Telecommunications and Signal Processing (TSP);2024-07-10
2. Linguistic analysis of human-computer interaction;Frontiers in Computer Science;2024-05-21