Frame-Based Phone Classification Using EMG Signals

Published:2023-06-30 Issue:13 Volume:13 Page:7746
ISSN:2076-3417
Container-title:Applied Sciences
language:en
Short-container-title:Applied Sciences

Author:

Salomons Inge¹, del Blanco Eder¹, Navas Eva¹, Hernáez Inma¹, de Zuazo Xabier¹

Affiliation:

1. HiTZ Basque Center for Language Technology, University of the Basque Country, Ingeniero Torres Quevedo Plaza, 1, 48013 Bilbao, Spain

Abstract

This paper evaluates the impact of inter-speaker and inter-session variability on the development of a silent speech interface (SSI) based on electromyographic (EMG) signals from the facial muscles. The final goal of the SSI is to provide a communication tool for Spanish-speaking laryngectomees by generating audible speech from voiceless articulation. Before moving on to such a complex task, however, this study first performs a simpler phone classification task under different conditions of speaker and session dependency. The experiments consist of processing the recorded utterances into phone-labeled segments and predicting the phonetic labels using only features obtained from the EMG signals. We evaluate and compare the performance of each model in terms of classification accuracy. Results show that the models predict the phonetic label best when they are trained and tested using data from the same session. The accuracy drops drastically when the model is tested with data from a different session, although it improves when more data are added to the training set. Similarly, when the same model is tested on a session from a different speaker, the accuracy decreases. This suggests that using larger amounts of data could help to reduce the impact of inter-session variability, but more research is required to understand whether this approach would suffice to account for inter-speaker variability as well.

Funder

Agencia Estatal de Investigación

Publisher

MDPI AG

Subject

Fluid Flow and Transfer Processes, Computer Science Applications, Process Chemistry and Technology, General Engineering, Instrumentation, General Materials Science

Link

https://www.mdpi.com/2076-3417/13/13/7746/pdf

Reference: 47 articles.

1. Hernaez, I., Gonzalez Lopez, J.A., Navas, E., Pérez Córdoba, J.L., Saratxaga, I., Olivares, G., Sanchez de la Fuente, J., Galdón, A., Garcia, V., and Castillo, J.d. (2022, January 14–16). ReSSInt project: Voice restoration using Silent Speech Interfaces. Proceedings of the IberSPEECH 2022, ISCA, Granada, Spain.

2. Tang (2015). Voice Restoration after Total Laryngectomy. Otolaryngol. Clin. N. Am.

3. Zieliński, K., and Rączaszek-Leonardi, J. (May, January 29). A Complex Human-Machine Coordination Problem: Essential Constraints on Interaction Control in Bionic Communication Systems. Proceedings of the CHI Conference on Human Factors in Computing Systems Extended Abstracts, New Orleans, LA, USA.

4. Wand, M., Janke, M., and Schultz, T. (2014, January 14–18). The EMG-UKA corpus for electromyographic speech processing. Proceedings of the Interspeech 2014, Singapore.

5. Gaddy, D., and Klein, D. (2020). Digital voicing of silent speech. arXiv.