Authors:
Stephen A. Zahorian, Xiaoyu Liu, Roozbeh Sadeghian
Abstract
Building on extensive prior speech-science research into the spectral-temporal properties of human speech perception, on the wide range of spectral-temporal speech features already in use, and on the time-frequency resolution properties of human hearing, this chapter proposes and evaluates a general class of spectral-temporal features. These features, intended primarily for Automatic Speech Recognition (ASR) front ends, allow different realizations of general time-frequency concepts to be implemented and tuned easily through a set of frequency-warping and time-warping functions. The methods are flexible enough to evaluate the relative importance of spectral and temporal features and to explore the trade-off between time and frequency resolution. Extensive ASR experiments were conducted within this unified framework to evaluate various spectral-temporal properties.