Feedback From Automatic Speech Recognition to Elicit Clear Speech in Healthy Speakers

Published: 2023-11-06 Issue: 6 Volume: 32 Page: 2940-2959
ISSN: 1058-0360
Container-title: American Journal of Speech-Language Pathology
Language: en
Short-container-title: Am J Speech Lang Pathol

Author:

Gutz Sarah E.¹², Maffei Marc F.¹, Green Jordan R.¹²

Affiliation:

1. Department of Communication Sciences and Disorders, MGH Institute of Health Professions, Boston, MA

2. Program in Speech and Hearing Bioscience and Technology, Harvard University, Cambridge, MA

Abstract

Purpose: This study assessed the effectiveness of feedback generated by automatic speech recognition (ASR) for eliciting clear speech from young, healthy individuals. As a preliminary step toward exploring a novel method for eliciting clear speech in patients with dysarthria, we investigated the effects of ASR feedback in healthy controls. If successful, ASR feedback has the potential to facilitate independent, at-home clear speech practice.

Method: Twenty-three healthy control speakers (ages 23–40 years) read sentences aloud in three speaking modes: Habitual, Clear (over-enunciated), and in response to ASR feedback (ASR). In the ASR condition, we used Mozilla DeepSpeech to transcribe speech samples and provide participants with a value indicating the accuracy of the ASR's transcription. For speakers who achieved sufficiently high ASR accuracy, noise was added to their speech at a participant-specific signal-to-noise ratio to ensure that each participant had to over-enunciate to achieve high ASR accuracy.

Results: Compared to habitual speech, speech produced in the ASR and Clear conditions was clearer, as rated by speech-language pathologists, and more intelligible, per speech-language pathologist transcriptions. Speech in the Clear and ASR conditions aligned on several acoustic measures, particularly those associated with increased vowel distinctiveness and decreased speaking rate. However, ASR accuracy, intelligibility, and clarity were each correlated with different speech features, which may have implications for how people change their speech for ASR feedback.

Conclusions: ASR successfully elicited outcomes similar to clear speech in healthy speakers. Future work should investigate its efficacy in eliciting clear speech in people with dysarthria.
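The two computational steps named in the Method (scoring an ASR transcript against the target sentence, and adding noise at a chosen signal-to-noise ratio) can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not state how the accuracy value was computed or what kind of noise was used, so the sketch assumes word-level accuracy defined as 1 minus word error rate and assumes white Gaussian noise. The function names `word_accuracy` and `add_noise_at_snr` are hypothetical, and the ASR transcript is passed in as a plain string rather than produced by DeepSpeech.

```python
import math
import random


def word_accuracy(reference: str, hypothesis: str) -> float:
    """Return 1 - WER, clipped to [0, 1] (assumed metric, not from the paper)."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        return 1.0 if not hyp else 0.0
    # Two-row dynamic programming for word-level Levenshtein distance.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    wer = prev[-1] / len(ref)
    return max(0.0, 1.0 - wer)


def add_noise_at_snr(signal, snr_db, rng=None):
    """Mix white Gaussian noise (an assumption) into `signal` at `snr_db`."""
    rng = rng or random.Random(0)
    signal_power = sum(x * x for x in signal) / len(signal)
    noise = [rng.gauss(0.0, 1.0) for _ in signal]
    noise_power = sum(n * n for n in noise) / len(noise)
    # Scale the noise so 10*log10(signal_power / noise_power) equals snr_db.
    target_noise_power = signal_power / (10 ** (snr_db / 10))
    scale = math.sqrt(target_noise_power / noise_power)
    return [x + scale * n for x, n in zip(signal, noise)]
```

In a feedback loop of this kind, a lower `snr_db` makes the recognition task harder, so a speaker who already scores highly in quiet would need to speak more clearly to keep the accuracy value up.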

Publisher

American Speech-Language-Hearing Association

Subject

Speech and Hearing, Linguistics and Language, Developmental and Educational Psychology, Otorhinolaryngology

Link

http://pubs.asha.org/doi/pdf/10.1044/2023_AJSLP-23-00030
