Sequence-to-Sequence Voice Reconstruction for Silent Speech in a Tonal Language

Published: 2022-06-23
Issue: 7
Volume: 12
Page: 818
ISSN: 2076-3425
Container-title: Brain Sciences
Language: en
Short-container-title: Brain Sciences

Author:

Li Huiyan, Lin Haohong, Wang You, Wang Hengyang, Zhang Ming, Gao Han, Ai Qing, Luo Zhiyuan, Li Guang

Abstract

Silent speech decoding (SSD), based on articulatory neuromuscular activities, has become a prevalent task of brain–computer interfaces (BCIs) in recent years. Many works have been devoted to decoding surface electromyography (sEMG) signals recorded from articulatory neuromuscular activities. However, restoring silent speech in tonal languages such as Mandarin Chinese remains difficult. This paper proposes an optimized sequence-to-sequence (Seq2Seq) approach to synthesize voice from sEMG-based silent speech. We extract duration information to regulate the sEMG-based silent speech using the audio length. Then, we provide a deep-learning model with an encoder–decoder structure and a state-of-the-art vocoder to generate the audio waveform. Experiments with six Mandarin Chinese speakers demonstrate that the proposed model can successfully decode silent speech in Mandarin Chinese, achieving an average character error rate (CER) of 6.41% under human evaluation.
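
The abstract reports its result as a character error rate. As a rough illustration of what that metric measures, the sketch below computes CER in the conventional way, as the character-level edit distance between a reference transcript and a listener's transcript, divided by the reference length. This is an assumption about the usual definition, not the authors' evaluation code; the function names and the example sentences are hypothetical.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two character sequences."""
    # prev[j] holds the distance between the processed prefix of ref
    # and the first j characters of hyp.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def character_error_rate(ref: str, hyp: str) -> float:
    """CER = edit distance / number of reference characters (assumed definition)."""
    if not ref:
        raise ValueError("reference transcript must be non-empty")
    return edit_distance(ref, hyp) / len(ref)


# Hypothetical example: one substituted character in a six-character sentence.
reference = "今天天气很好"
transcript = "今天天汽很好"
cer = character_error_rate(reference, transcript)
```

Under this definition a CER of 6.41% would correspond to roughly six or seven character errors per hundred reference characters, averaged over the evaluated utterances.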

Funder

The Science Foundation of Chinese Aerospace Industry

Publisher

MDPI AG

Subject

General Neuroscience

Link

https://www.mdpi.com/2076-3425/12/7/818/pdf
