Iterative alignment discovery of speech-associated neural activity

Published:2024-08-01 Issue:4 Volume:21 Page:046056
ISSN:1741-2560
Container-title:Journal of Neural Engineering
Short-container-title:J. Neural Eng.

Author:

Rabbani Qinwan, Shah Samyak, Milsap Griffin, Fifer Matthew, Hermansky Hynek, Crone Nathan

Abstract

Objective. Brain–computer interfaces (BCIs) have the potential to preserve or restore speech in patients with neurological disorders that weaken the muscles involved in speech production. However, successful training of low-latency speech synthesis and recognition models requires alignment of neural activity with intended phonetic or acoustic output with high temporal precision. This is particularly challenging in patients who cannot produce audible speech, as ground truth with which to pinpoint neural activity synchronized with speech is not available.

Approach. In this study, we present a new iterative algorithm for neural voice activity detection (nVAD) called iterative alignment discovery dynamic time warping (IAD-DTW) that integrates DTW into the loss function of a deep neural network (DNN). The algorithm is designed to discover the alignment between a patient’s electrocorticographic (ECoG) neural responses and their attempts to speak during collection of data for training BCI decoders for speech synthesis and recognition.

Main results. To demonstrate the effectiveness of the algorithm, we tested its accuracy in predicting the onset and duration of acoustic signals produced by able-bodied patients with intact speech undergoing short-term diagnostic ECoG recordings for epilepsy surgery. We simulated a lack of ground truth by randomly perturbing the temporal correspondence between neural activity and an initial single estimate for all speech onsets and durations. We examined the model’s ability to overcome these perturbations to estimate ground truth. IAD-DTW showed no notable degradation (<1% absolute decrease in accuracy) in performance in these simulations, even in the case of maximal misalignments between speech and silence.

Significance. IAD-DTW is computationally inexpensive and can be easily integrated into existing DNN-based nVAD approaches, as it pertains only to the final loss computation. This approach makes it possible to train speech BCI algorithms using ECoG data from patients who are unable to produce audible speech, including those with Locked-In Syndrome.
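
The abstract does not say how DTW enters the loss, so the sketch below is not the authors' IAD-DTW. It only illustrates the general idea the abstract builds on: classic dynamic time warping can re-time a misaligned initial speech/silence label estimate against a model's per-frame outputs, giving targets that better match when speech appears to occur. The function name `dtw_realign`, the absolute-difference local cost, and the toy data are all assumptions made for illustration.

```python
import numpy as np

def dtw_realign(frame_probs, initial_labels):
    """Warp an initial 0/1 label sequence onto per-frame speech probabilities.

    Hypothetical helper: plain DTW with an absolute-difference local cost.
    Returns one realigned label per frame of `frame_probs`.
    """
    probs = np.asarray(frame_probs, dtype=float)
    labels = np.asarray(initial_labels, dtype=float)
    n, m = len(probs), len(labels)

    # Local cost between every frame and every entry of the label estimate.
    cost = np.abs(probs[:, None] - labels[None, :])

    # Accumulated cost with the standard three-way step pattern.
    acc = np.full((n, m), np.inf)
    acc[0, 0] = cost[0, 0]
    for i in range(n):
        for j in range(m):
            if i == 0 and j == 0:
                continue
            best = np.inf
            if i > 0:
                best = min(best, acc[i - 1, j])
            if j > 0:
                best = min(best, acc[i, j - 1])
            if i > 0 and j > 0:
                best = min(best, acc[i - 1, j - 1])
            acc[i, j] = cost[i, j] + best

    # Backtrack along the cheapest path, copying the matched label to each frame.
    realigned = np.zeros(n)
    i, j = n - 1, m - 1
    realigned[i] = labels[j]
    while i > 0 or j > 0:
        candidates = []
        if i > 0 and j > 0:
            candidates.append((acc[i - 1, j - 1], i - 1, j - 1))
        if i > 0:
            candidates.append((acc[i - 1, j], i - 1, j))
        if j > 0:
            candidates.append((acc[i, j - 1], i, j - 1))
        _, i, j = min(candidates)
        realigned[i] = labels[j]
    return realigned

# Toy data: the model output places speech in frames 4-6, while the
# initial estimate (deliberately misaligned) places it in frames 1-3.
frame_probs = [0, 0, 0, 0, 1, 1, 1, 0, 0, 0]
initial_labels = [0, 1, 1, 1, 0, 0, 0, 0, 0, 0]
realigned = dtw_realign(frame_probs, initial_labels)
print(realigned)
```

In an iterative scheme of the kind the abstract describes, labels realigned this way could serve as targets for the next round of training. How the published method couples this step to the DNN loss is not stated here and should be taken from the paper itself.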

Funder

National Institute of Neurological Disorders and Stroke

Publisher

IOP Publishing

Link

https://iopscience.iop.org/article/10.1088/1741-2552/ad663c/pdf
