Abstract
Objective. When listening to continuous speech, populations of neurons in the brain track different features of the signal. Neural tracking can be measured by relating the electroencephalography (EEG) signal to the speech signal. Recent studies using linear models have shown a significant contribution of linguistic features over and above acoustic neural tracking. However, linear models cannot capture the nonlinear dynamics of the brain. To overcome this, we use a convolutional neural network (CNN) that relates EEG to linguistic features, using phoneme or word onsets as a control, and has the capacity to model nonlinear relations. Approach. We integrate phoneme- and word-based linguistic features (phoneme surprisal, cohort entropy (CE), word surprisal (WS) and word frequency (WF)) in our nonlinear CNN model and investigate whether they carry additional information on top of lexical features (phoneme and word onsets). We then compare the performance of our nonlinear CNN with that of a linear encoder and a linearized CNN. Main results. For the nonlinear CNN, we found a significant contribution of CE over phoneme onsets and of WS and WF over word onsets. Moreover, the nonlinear CNN outperformed the linear baselines. Significance. Measuring the coding of linguistic features in the brain is important for auditory neuroscience research and for applications that involve objectively measuring speech understanding. With linear models this is measurable, but the effects are very small. The proposed nonlinear CNN model yields larger differences between linguistic and lexical models and could therefore reveal effects that would otherwise be unmeasurable, and may in the future lead to improved within-subject measures and shorter recordings.
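The CNN described in the abstract builds on dilated convolutions over the stimulus and EEG time series. As a minimal, hedged sketch of that building block (the kernel size, dilation factors, and layer count below are illustrative assumptions, not the paper's actual architecture), a dilated causal 1-D convolution and its receptive-field calculation can be written in plain Python:

```python
# Sketch of a dilated causal 1-D convolution, the kind of layer a dilated
# CNN stacks to relate stimulus features to EEG over long time windows.
# Kernel values and dilation factors here are illustrative assumptions.

def dilated_conv1d(x, kernel, dilation):
    """Causal 1-D convolution with the given dilation.

    Output at time t depends only on x[t], x[t - d], x[t - 2d], ...;
    samples before the start of the signal are treated as zero.
    """
    out = []
    for t in range(len(x)):
        acc = 0.0
        for i, w in enumerate(kernel):
            j = t - i * dilation
            acc += w * (x[j] if j >= 0 else 0.0)
        out.append(acc)
    return out

def receptive_field(kernel_size, dilations):
    """Number of input samples the final output sample can see
    after stacking one layer per dilation factor."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)

# Stacking layers with dilations 1, 2, 4, ... grows the receptive field
# exponentially with depth while keeping the parameter count small,
# which is what lets such a model integrate context over long stretches
# of speech without a huge number of weights.
print(receptive_field(3, [1, 2, 4]))  # kernel size 3, three layers
```

With kernel size 3 and dilations 1, 2 and 4, the stack covers 15 input samples, versus 7 for three undilated layers of the same size.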
Funder
KU Leuven
Fonds Wetenschappelijk Onderzoek
Subject
Cellular and Molecular Neuroscience, Biomedical Engineering
Cited by 4 articles.