Speech Recognition and Listening Effort of Meaningful Sentences Using Synthetic Speech

Published: 2022-01 Volume: 26 Page: 233121652211306
ISSN: 2331-2165
Container-title: Trends in Hearing
language: en
Short-container-title: Trends in Hearing

Author:

Ibelings Saskia¹²³, Brand Thomas²³, Holube Inga¹³

Affiliation:

1. Institute of Hearing Technology and Audiology, Jade University of Applied Sciences, Oldenburg, Germany

2. Medizinische Physik, Universität Oldenburg, Oldenburg, Germany

3. Cluster of Excellence Hearing4All, Oldenburg, Germany

Abstract

Speech-recognition tests are an important component of audiology. However, the development of such tests can be time-consuming. The aim of this study was to investigate whether a text-to-speech (TTS) system can reduce the cost of development, and whether comparable results can be achieved in terms of speech recognition and listening effort. For this, the everyday sentences of the German Göttingen sentence test were synthesized for both a female and a male speaker using a TTS system. In a preliminary study, this system was rated as good, but worse than the natural reference. Due to the COVID-19 pandemic, the measurements took place online. Each set of speech material was presented at three fixed signal-to-noise ratios. The participants' responses were recorded and analyzed offline. Compared to the natural speech, the adjusted psychometric functions for the synthetic speech, independent of the speaker, showed an improvement of the speech-recognition threshold (SRT) of approximately 1.2 dB. The slopes, which were also independent of the speaker, were about 15 percentage points per dB. The time periods between the end of the stimulus presentation and the beginning of the verbal response (verbal response time) were comparable for all speakers, suggesting no difference in listening effort. The SRT values obtained in the online measurement for the natural speech were comparable to published data. In summary, the time and effort for the development of speech-recognition tests may be significantly reduced by using a TTS system. This finding provides the opportunity to develop new speech tests with a large amount of speech material.
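To make the reported quantities concrete, the following is a minimal sketch of how an SRT and a slope parameterize a psychometric function. It assumes a standard logistic form, which the abstract does not specify, and it reads "improvement" as a lower SRT. The slope (0.15 per dB) and the 1.2 dB shift come from the abstract; the natural-speech SRT of -6 dB SNR and the names `psychometric`, `SRT_NATURAL_DB`, and `SRT_SYNTHETIC_DB` are hypothetical placeholders for illustration only.

```python
import math


def psychometric(snr_db: float, srt_db: float, slope_per_db: float) -> float:
    """Logistic psychometric function (assumed form, not confirmed by the source).

    Returns the proportion of speech recognized at a given SNR. The function
    equals 0.5 at the SRT, and its derivative there equals slope_per_db.
    """
    return 1.0 / (1.0 + math.exp(-4.0 * slope_per_db * (snr_db - srt_db)))


SLOPE_PER_DB = 0.15      # about 15 percentage points per dB (from the abstract)
SRT_SHIFT_DB = 1.2       # approximate SRT improvement for synthetic speech (from the abstract)
SRT_NATURAL_DB = -6.0    # hypothetical placeholder value, NOT taken from the source
SRT_SYNTHETIC_DB = SRT_NATURAL_DB - SRT_SHIFT_DB  # assumes improvement means a lower SRT

if __name__ == "__main__":
    # Compare predicted recognition for both speech types at a few SNRs.
    for snr in (-9.0, -7.2, -6.0, -3.0):
        p_nat = psychometric(snr, SRT_NATURAL_DB, SLOPE_PER_DB)
        p_syn = psychometric(snr, SRT_SYNTHETIC_DB, SLOPE_PER_DB)
        print(f"SNR {snr:5.1f} dB: natural {p_nat:.2f}, synthetic {p_syn:.2f}")
```

Under these assumptions, a lower SRT shifts the whole curve toward lower SNRs, so at any fixed SNR the synthetic-speech curve predicts a higher recognition score than the natural-speech curve.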

Funder

Graduation program of Jade University of Applied Sciences

Publisher

SAGE Publications

Subject

Speech and Hearing, Otorhinolaryngology

Link

http://journals.sagepub.com/doi/pdf/10.1177/23312165221130656

References: 51 articles.

1. Realistic precision and accuracy of online experiment platforms, web browsers, and devices

2. Analyzing reaction times

3. Boersma P., Weenink D. (2007). PRAAT: Doing phonetics by computer (Version 5.3.51).

4. Efficient adaptive procedures for threshold and concurrent slope estimates for psychophysics and speech intelligibility tests

5. Speech Synthesis: Toward a “Voice” for All

Cited by 4 articles.

1. Global access to speech hearing tests;2024-06-13

2. Measurement and optimisation of the perceptual equivalence of the Dutch consonant-vowel-consonant (CVC) word lists using synthetic speech and list pairs;International Journal of Audiology;2024-02-07

3. Development and validation of a French speech-in-noise self-test using synthetic voice in an adult population;Frontiers in Audiology and Otology;2024-01-26

4. Development of a Phrase-Based Speech-Recognition Test Using Synthetic Speech;Trends in Hearing;2024-01