Abstract
Human movement studies and analyses have been fundamental in many scientific domains, ranging from neuroscience to education, pattern recognition to robotics, health care to sports, and beyond. Earlier speech motor models were proposed to explain how speech movement is produced and how the resulting speech varies when some parameters are changed. However, such models do not support the inverse approach, in which the muscular response parameters and the subject's age are derived from real continuous speech. In the handwriting field, by contrast, the kinematic theory of rapid human movements and its associated Sigma-lognormal model have been applied successfully to obtain the muscular response parameters. This work presents a speech kinematics-based model that can be used to study, analyze, and reconstruct complex speech kinematics in a simplified manner. A method based on the kinematic theory of rapid human movements and its associated Sigma-lognormal model is applied to describe and parameterize the asymptotic impulse response of the neuromuscular networks involved in speech as a response to a neuromotor command. The method used to transform formants into a movement observation is also presented. Experiments carried out with the (English) VTR-TIMIT database and the (German) Saarbrucken Voice Database, including people of different ages, with and without laryngeal pathologies, corroborate the link between the extracted parameters and aging, on the one hand, and the proportion between the first and second formants required in applying the kinematic theory of rapid human movements, on the other. The results should drive innovative developments in the modeling and understanding of speech kinematics.
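As a rough illustration of the Sigma-lognormal idea mentioned in the abstract, the sketch below evaluates the standard lognormal speed profile of the kinematic theory, v(t) = D / (σ√(2π)(t − t0)) · exp(−(ln(t − t0) − μ)² / (2σ²)), and sums several such strokes vectorially. It is a minimal sketch, not the authors' implementation: the function names, the parameter values, and the use of a fixed direction per stroke (instead of a direction that evolves along the stroke) are assumptions made here for brevity.

```python
import numpy as np


def lognormal_speed(t, D, t0, mu, sigma):
    """Speed of one stroke: a lognormal in (t - t0), scaled by amplitude D.

    The speed is defined as zero for t <= t0 (before the command onset).
    """
    t = np.asarray(t, dtype=float)
    v = np.zeros_like(t)
    active = t > t0
    dt = t[active] - t0
    v[active] = (
        D / (sigma * np.sqrt(2.0 * np.pi) * dt)
        * np.exp(-((np.log(dt) - mu) ** 2) / (2.0 * sigma ** 2))
    )
    return v


def sigma_lognormal_velocity(t, strokes):
    """Vector sum of strokes given as (D, t0, mu, sigma, theta).

    Assumption: each stroke keeps a constant direction theta (radians).
    """
    t = np.asarray(t, dtype=float)
    vx = np.zeros_like(t)
    vy = np.zeros_like(t)
    for D, t0, mu, sigma, theta in strokes:
        s = lognormal_speed(t, D, t0, mu, sigma)
        vx += s * np.cos(theta)
        vy += s * np.sin(theta)
    return vx, vy


# Hypothetical parameter values, chosen only to make the example concrete.
t = np.linspace(0.0, 2.0, 2001)
strokes = [
    (1.0, 0.05, -1.5, 0.30, 0.0),        # stroke along the x axis
    (0.8, 0.30, -1.4, 0.25, np.pi / 2),  # later stroke along the y axis
]
vx, vy = sigma_lognormal_velocity(t, strokes)
```

Integrating each speed profile over time recovers its amplitude D, which is why D is read as the distance covered by the stroke, while μ and σ describe the delay and response time of the neuromuscular system on a logarithmic scale.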
Funder
Ministerio de Economía, Industria y Competitividad, Gobierno de España
Interreg
Natural Sciences and Engineering Research Council of Canada
Ministerio de Ciencia, Innovación y Universidades
Ministerio de Educación y Formación Profesional
Publisher
Springer Science and Business Media LLC
Subject
Cognitive Neuroscience, Computer Science Applications, Computer Vision and Pattern Recognition
Cited by
3 articles.
1. A Machine Learning Approach to Analyze the Effects of Alzheimer’s Disease on Handwriting Through Lognormal Features;Graphonomics in Human Body Movement. Bridging Research and Practice from Motor Control to Handwriting Analysis and Recognition;2023
2. Lognormality: An Open Window on Neuromotor Control;Graphonomics in Human Body Movement. Bridging Research and Practice from Motor Control to Handwriting Analysis and Recognition;2023
3. Lognormal Features for Early Diagnosis of Alzheimer’s Disease Through Handwriting Analysis;Lecture Notes in Computer Science;2022