Abstract
Objective. Silent speech recognition (SSR) based on surface electromyography (sEMG) is an attractive non-acoustic modality for human-machine interfaces that converts neuromuscular electrophysiological signals into computer-readable text. Speaking involves complex neuromuscular activity spanning a large area of the facial and neck muscles, so the locations of the sEMG electrodes considerably affect the performance of an SSR system. However, most previous studies used only a limited number of electrodes that were placed empirically without prior quantitative analysis, resulting in uncertain and unreliable SSR outcomes.
Approach. In this study, high-density sEMG was proposed to provide a full representation of the articulatory muscle activities so that the optimal electrode configuration for SSR could be systematically explored. A total of 120 closely spaced electrodes were placed on the facial and neck muscles to collect high-density sEMG signals for classifying ten digits (0–9) silently spoken in both English and Chinese. The sequential forward selection algorithm was adopted to explore the optimal electrode configurations.
Main Results. The classification accuracy increased rapidly and then saturated as the number of selected electrodes increased from 1 to 120. Using only ten optimal electrodes achieved a classification accuracy of 86% for English and 94% for Chinese, whereas as many as 40 non-optimized electrodes were required to obtain comparable accuracies. In addition, the optimally selected electrodes seemed to be mostly distributed on the neck rather than the facial region, and more electrodes were required for English recognition to achieve the same accuracy.
Significance. The findings of this study can provide useful guidelines on electrode placement for developing a clinically feasible SSR system and implementing a promising human-machine interface approach, especially for patients with speaking difficulties.
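The sequential forward selection step named in the Approach can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not state the features or classifier used, so the sketch assumes one feature per electrode channel and a simple nearest-centroid classifier scored on a hold-out split. The function names (`nearest_centroid_accuracy`, `sequential_forward_selection`, `make_synthetic_features`) and the synthetic data are hypothetical.

```python
import numpy as np


def nearest_centroid_accuracy(X_train, y_train, X_test, y_test):
    """Hold-out accuracy of a nearest-centroid classifier (assumed stand-in classifier)."""
    classes = np.unique(y_train)
    centroids = np.stack([X_train[y_train == c].mean(axis=0) for c in classes])
    # Squared Euclidean distance from every test sample to every class centroid.
    dists = ((X_test[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    predictions = classes[dists.argmin(axis=1)]
    return float((predictions == y_test).mean())


def sequential_forward_selection(X_train, y_train, X_test, y_test, n_select):
    """Greedily add the channel that most improves accuracy, one channel per round."""
    remaining = list(range(X_train.shape[1]))
    selected, history = [], []
    while len(selected) < n_select and remaining:
        best_channel, best_acc = None, -1.0
        for channel in remaining:
            cols = selected + [channel]
            acc = nearest_centroid_accuracy(
                X_train[:, cols], y_train, X_test[:, cols], y_test
            )
            if acc > best_acc:
                best_channel, best_acc = channel, acc
        selected.append(best_channel)
        remaining.remove(best_channel)
        history.append(best_acc)  # accuracy after each added channel
    return selected, history


def make_synthetic_features(n_per_class=60, n_channels=8, seed=0):
    """Synthetic data (assumption): 4 classes, only channels 2 and 5 carry class information."""
    rng = np.random.default_rng(seed)
    labels = np.repeat(np.arange(4), n_per_class)
    X = rng.normal(0.0, 0.5, size=(labels.size, n_channels))
    X[:, 2] += 5.0 * (labels % 2)
    X[:, 5] += 5.0 * (labels // 2)
    order = rng.permutation(labels.size)
    return X[order], labels[order]


if __name__ == "__main__":
    X, y = make_synthetic_features()
    half = len(y) // 2
    channels, accuracies = sequential_forward_selection(
        X[:half], y[:half], X[half:], y[half:], n_select=3
    )
    print("selected channels:", channels)
    print("accuracy per step:", accuracies)
```

Recording the accuracy after each added channel gives the kind of accuracy-versus-electrode-count curve the Main Results describe. A real study would typically replace the single hold-out split with cross-validation.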
Funder
Science and Technology Planning Project of Shenzhen
Science and Technology Program of Guangzhou
Shenzhen Governmental Basic Research Grant
National Natural Science Foundation of China
Shenzhen Science and Technology Development Fund
Science and Technology Planning Project of Guangdong Province
Subject
Cellular and Molecular Neuroscience, Biomedical Engineering
Cited by
18 articles.