Abstract
This paper proposes a source-filter model of voice production, specifically for voiced sounds. The model generates the voice signal using linear time-invariant systems and takes into account the biophysics of phonation and the cyclostationary characteristics of the voice signal, which are related to the vibrational behavior of the vocal cords. The model suggests that the oscillation frequency of the vocal cords is a function of their mass and length, but is controlled by the longitudinal tension applied to them. A mathematical description of the glottal excitation model is presented, along with a closed-form expression for the power spectral density of the signal that excites the glottis. The voice signal, whose parameters can be adjusted for the detection and classification of glottal pathologies, is also presented. As a result, the output of each block of the diagram that represents the proposed model is analysed, including a power spectral density comparison between the emulated voice, the original voice, and the classic source-filter model. The Log Spectral Distortion is computed, yielding values below 1.40 dB, which indicates an acceptable distortion in all cases.
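The abstract does not state the exact formula used for the Log Spectral Distortion. As a minimal sketch, assuming the commonly used definition (the root-mean-square difference, in dB, between two power spectral densities sampled at the same frequency bins), it could be computed as follows. The function name `log_spectral_distortion` and the `eps` guard are illustrative assumptions, not taken from the paper.

```python
import math


def log_spectral_distortion(psd_ref, psd_test, eps=1e-12):
    """RMS log-spectral difference in dB between two PSDs.

    Assumes both PSDs are sampled at the same frequency bins and hold
    non-negative power values. `eps` is a hypothetical guard against
    taking the logarithm of zero.
    """
    if len(psd_ref) != len(psd_test) or len(psd_ref) == 0:
        raise ValueError("PSDs must be non-empty and of equal length")

    squared_sum = 0.0
    for p_ref, p_test in zip(psd_ref, psd_test):
        # Per-bin difference of the two spectra on a dB scale.
        diff_db = 10.0 * math.log10((p_ref + eps) / (p_test + eps))
        squared_sum += diff_db * diff_db

    # Root of the mean squared dB difference over all bins.
    return math.sqrt(squared_sum / len(psd_ref))


# Illustrative values only (not data from the paper).
original = [1.0, 2.0, 4.0]
emulated = [10.0, 20.0, 40.0]
lsd_db = log_spectral_distortion(original, emulated)
```

Under this definition, identical spectra give 0 dB, and a uniform tenfold power offset gives about 10 dB, so values below 1.40 dB correspond to spectra that differ by only a small amount on average across the bins.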
Publisher
Springer Science and Business Media LLC