Authors: Alireza Goudarzi, Gemma Moya-Galé
Abstract
The sophistication of artificial intelligence (AI) technologies has advanced significantly in the past decade. However, the unpredictability and variability of AI behavior on noisy signals remain underexplored and pose a challenge when generalizing AI behavior to real-life environments, especially for people with speech disorders, who already experience reduced speech intelligibility. In the context of developing assistive technology for people with Parkinson's disease using automatic speech recognition (ASR), this pilot study reports on the performance of Google Cloud speech-to-text technology on dysarthric and healthy speech in the presence of multi-talker babble noise at different intensity levels. Despite these systems' sensitivities and shortcomings, their performance can be controlled with current tools in order to measure speech intelligibility in real-life conditions.
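The study evaluates ASR on speech mixed with multi-talker babble at different intensity levels. A minimal sketch of how such a mixture at a target signal-to-noise ratio (SNR) might be constructed is shown below; the function name and the numpy-based approach are illustrative assumptions, not taken from the paper:

```python
import numpy as np

def mix_at_snr(speech: np.ndarray, babble: np.ndarray, snr_db: float) -> np.ndarray:
    """Add babble noise to a speech signal at a requested SNR in dB.

    Both inputs are 1-D float arrays at the same sampling rate; the
    babble is trimmed to the speech length before mixing.
    """
    babble = babble[: len(speech)]
    speech_power = np.mean(speech ** 2)
    babble_power = np.mean(babble ** 2)
    # Choose a gain so that speech_power / (gain^2 * babble_power)
    # equals the linear SNR, 10 ** (snr_db / 10).
    gain = np.sqrt(speech_power / (babble_power * 10 ** (snr_db / 10)))
    return speech + gain * babble
```

Sweeping `snr_db` (e.g. from +15 dB down to 0 dB) and feeding each mixture to the recognizer would yield a word-error-rate curve against noise intensity, which is one plausible way to quantify the degradation the abstract describes.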
Cited by: 10 articles.