Deriving Vocal Fold Oscillation Information from Recorded Voice Signals Using Models of Phonation

Published: 2023-07-10 Issue: 7 Volume: 25 Page: 1039
ISSN: 1099-4300
Container-title: Entropy
Language: en
Short-container-title: Entropy

Author:

Zhao Wayne¹, Singh Rita²

Affiliation:

1. Department of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, PA 15213, USA

2. School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA

Abstract

During phonation, the vocal folds exhibit a self-sustained oscillatory motion, which is influenced by the physical properties of the speaker’s vocal folds and driven by the balance of bio-mechanical and aerodynamic forces across the glottis. Subtle changes in the speaker’s physical state can affect voice production and alter these oscillatory patterns. Measuring these can be valuable in developing computational tools that analyze voice to infer the speaker’s state. Traditionally, vocal fold oscillations (VFOs) are measured directly using physical devices in clinical settings. In this paper, we propose a novel analysis-by-synthesis approach that allows us to infer the VFOs directly from recorded speech signals on an individualized, speaker-by-speaker basis. The approach, called the ADLES-VFT algorithm, is proposed in the context of a joint model that combines a phonation model (with a glottal flow waveform as the output) and a vocal tract acoustic wave propagation model such that the output of the joint model is an estimated waveform. The ADLES-VFT algorithm is a forward-backward algorithm which minimizes the error between the recorded waveform and the output of this joint model to estimate its parameters. Once estimated, these parameter values are used in conjunction with a phonation model to obtain its solutions. Since the parameters correlate with the physical properties of the vocal folds of the speaker, model solutions obtained using them represent the individualized VFOs for each speaker. The approach is flexible and can be applied to various phonation models. In addition to presenting the methodology, we show how the VFOs can be quantified from a dynamical systems perspective for classification purposes. Mathematical derivations are provided in an appendix for better readability.
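The analysis-by-synthesis idea in the abstract (choose model parameters so that the model output matches the recording, then solve the model with those parameters) can be illustrated with a deliberately simplified sketch. This is not the ADLES-VFT algorithm. As assumptions made only for illustration, a van der Pol oscillator stands in for the phonation model, a synthetic signal stands in for the recorded waveform, there is no vocal tract model, and a plain grid search replaces the forward-backward estimation. The names `simulate`, `squared_error`, `estimate_mu`, and `TRUE_MU` are hypothetical.

```python
# Toy analysis-by-synthesis sketch. NOT the ADLES-VFT algorithm:
# a van der Pol oscillator is an assumed stand-in for the phonation model,
# and a grid search is an assumed stand-in for the forward-backward estimation.


def simulate(mu, n_steps=2000, dt=0.01, x0=0.1, v0=0.0):
    """Integrate x'' - mu*(1 - x^2)*x' + x = 0 with classical RK4.

    Returns the displacement x sampled at each of the n_steps time steps.
    """
    def deriv(x, v):
        # First-order form: dx/dt = v, dv/dt = mu*(1 - x^2)*v - x
        return v, mu * (1.0 - x * x) * v - x

    xs = []
    x, v = x0, v0
    for _ in range(n_steps):
        xs.append(x)
        k1x, k1v = deriv(x, v)
        k2x, k2v = deriv(x + 0.5 * dt * k1x, v + 0.5 * dt * k1v)
        k3x, k3v = deriv(x + 0.5 * dt * k2x, v + 0.5 * dt * k2v)
        k4x, k4v = deriv(x + dt * k3x, v + dt * k3v)
        x += dt * (k1x + 2 * k2x + 2 * k3x + k4x) / 6.0
        v += dt * (k1v + 2 * k2v + 2 * k3v + k4v) / 6.0
    return xs


def squared_error(a, b):
    """Sum of squared differences between two equal-length signals."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def estimate_mu(observed, candidates):
    """Return the candidate parameter whose simulated output best matches `observed`."""
    return min(candidates, key=lambda mu: squared_error(simulate(mu), observed))


TRUE_MU = 0.8                                # hypothetical "speaker" parameter
observed = simulate(TRUE_MU)                 # synthetic stand-in for a recording
candidates = [i / 10 for i in range(1, 21)]  # grid: 0.1, 0.2, ..., 2.0

mu_hat = estimate_mu(observed, candidates)   # analysis step: fit the parameter
oscillation = simulate(mu_hat)               # synthesis step: solve the model with it
```

In this toy setting the observed signal is generated by the same simulator with the same initial conditions, so the grid search recovers the generating parameter exactly. A real recording would not match any candidate perfectly, which is one reason the paper pairs the phonation model with a vocal tract model and uses a dedicated estimation algorithm.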

Publisher

MDPI AG

Subject

General Physics and Astronomy

Link

https://www.mdpi.com/1099-4300/25/7/1039/pdf

References (54 articles)

1. Review on Mathematical and Mechanical Models of the Vocal Cord;Cveticanin;J. Appl. Math.,2012

2. The physics of small-amplitude oscillation of the vocal folds;Titze;J. Acoust. Soc. Am.,1988

3. Döllinger, M., Gómez, P., Patel, R.R., Alexiou, C., Bohr, C., and Schützenberger, A. (2017). Biomechanical simulation of vocal fold dynamics in adults based on laryngeal high-speed videoendoscopy. PLoS ONE, 12.

4. Electroglottographic wavegrams: A technique for visualizing vocal fold dynamics noninvasively;Herbst;J. Acoust. Soc. Am.,2010

5. Irregular vocal-fold vibration—High-speed observation and modeling;Mergell;J. Acoust. Soc. Am.,2000

Cited by 2 articles.

1. The Physics of the Human Vocal Folds as a Biological Oscillator;New Insights on Oscillators and Their Applications to Engineering and Science;2024-03-20

2. Confounding Factor Analysis for Vocal Fold Oscillations;Entropy;2023-11-23