Generalized Spectral-Temporal Features for Representing Speech Information

Published: 2022-05-26
Container-title: Computational Semantics [Working Title]

Author:

Stephen A. Zahorian, Xiaoyu Liu, Roozbeh Sadeghian

Abstract

Based on extensive prior studies of speech science focused on the spectral-temporal properties of human speech perception, as well as a wide range of spectral-temporal speech features already in use, and motivated by the time-frequency resolution properties of human hearing, this chapter proposes and evaluates one general class of spectral-temporal features. These features, intended primarily for use in Automatic Speech Recognition (ASR) front ends, allow different realizations of general time-frequency concepts to be easily implemented and tuned through a set of frequency and time-warping functions. The methods presented are flexible enough to allow evaluation of the relative importance of the spectral and temporal features and to explore the trade-off between time and frequency resolution. Extensive ASR experiments were conducted to evaluate various spectral-temporal properties using this unified framework.

Publisher

IntechOpen

Link

http://www.intechopen.com/download/pdf/81955
