Decoding Speech and Music Stimuli from the Frequency Following Response

Published: 2019-06-05

Author:

Steven Losorelli, Blair Kaneshiro, Gabriella A. Musacchia, Nikolas H. Blevins, Matthew B. Fitzgerald

Abstract

The ability to differentiate complex sounds is essential for communication. Here, we propose using a machine-learning approach, called classification, to objectively evaluate auditory perception. In this study, we recorded frequency following responses (FFRs) from 13 normal-hearing adult participants to six short music and speech stimuli sharing similar fundamental frequencies but varying in overall spectral and temporal characteristics. Each participant also completed a perceptual identification test using the same stimuli. We used linear discriminant analysis to classify FFRs. Results showed statistically significant FFR classification accuracies using both the full response epoch in the time domain (72.3% accuracy, p < 0.001) and real and imaginary Fourier coefficients up to 1 kHz (74.6%, p < 0.001). To examine which response features contributed to successful decoding, we classified decomposed versions of the responses. Classifier accuracies using Fourier magnitude and phase alone in the same frequency range were lower but still significant (58.2% and 41.3%, respectively, p < 0.001). Classification of overlapping 20-ms subsets of the FFR in the time domain similarly produced reduced but significant accuracies (42.3%–62.8%, p < 0.001). Participants’ mean perceptual responses were most accurate (90.6%, p < 0.001). Confusion matrices from FFR classifications and perceptual responses were converted to distance matrices and visualized as dendrograms. FFR classifications and perceptual responses demonstrate similar patterns of confusion across the stimuli. Our results demonstrate that classification can differentiate auditory stimuli from FFRs with high accuracy.

Moreover, the reduced accuracies obtained when the FFR is decomposed in the time and frequency domains suggest that different response features contribute complementary information, similar to how the human auditory system is thought to rely on both timing and frequency information to accurately process sound. Taken together, these results suggest that FFR classification is a promising approach for objective assessment of auditory perception.
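
To make the pipeline described in the abstract more concrete, the following is a minimal sketch on synthetic data, not the authors' code. It builds real and imaginary Fourier features up to 1 kHz, classifies them with linear discriminant analysis, and converts the resulting confusion matrix into a distance matrix for hierarchical clustering. The sampling rate, epoch length, trial counts, noise level, shrinkage setting, 5-fold cross-validation, and the particular confusion-to-distance transform are all assumptions chosen for illustration; the abstract does not specify them.

```python
import numpy as np
from scipy.cluster.hierarchy import dendrogram, linkage
from scipy.spatial.distance import squareform
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(0)

# Hypothetical recording parameters (not taken from the paper).
fs = 5000                # sampling rate in Hz
n_samples = 500          # samples per epoch (100 ms)
n_classes = 6            # six stimuli, as in the abstract
trials_per_class = 40
t = np.arange(n_samples) / fs

# Synthetic stand-in for FFRs: all classes share a 100 Hz fundamental but
# differ in the amplitude and phase of their first five harmonics.
harmonics = 100.0 * np.arange(1, 6)
class_amps = rng.uniform(0.2, 1.0, size=(n_classes, harmonics.size))
class_phases = rng.uniform(0.0, 2.0 * np.pi, size=(n_classes, harmonics.size))

def synth_epoch(cls):
    """Return one noisy synthetic epoch for stimulus class `cls`."""
    clean = sum(
        a * np.sin(2.0 * np.pi * f * t + p)
        for a, f, p in zip(class_amps[cls], harmonics, class_phases[cls])
    )
    return clean + rng.normal(scale=4.0, size=n_samples)

y = np.repeat(np.arange(n_classes), trials_per_class)
X_time = np.stack([synth_epoch(c) for c in y])

def fourier_features(epochs, fs, fmax=1000.0):
    """Concatenate real and imaginary FFT coefficients for 0 < f <= fmax."""
    spec = np.fft.rfft(epochs, axis=1)
    freqs = np.fft.rfftfreq(epochs.shape[1], d=1.0 / fs)
    keep = (freqs > 0) & (freqs <= fmax)   # drop the DC bin
    return np.hstack([spec[:, keep].real, spec[:, keep].imag])

X_freq = fourier_features(X_time, fs)

# Shrinkage LDA is an assumption made here to cope with many features
# relative to the number of trials.
lda = LinearDiscriminantAnalysis(solver="lsqr", shrinkage="auto")
y_pred = cross_val_predict(lda, X_freq, y, cv=5)
accuracy = np.mean(y_pred == y)

# Confusion matrix -> row-normalized probabilities -> symmetric distance.
cm = confusion_matrix(y, y_pred)
probs = cm / cm.sum(axis=1, keepdims=True)
similarity = (probs + probs.T) / 2.0
dist = 1.0 - similarity
np.fill_diagonal(dist, 0.0)

# Average-linkage clustering; no_plot=True returns the leaf order only.
Z = linkage(squareform(dist), method="average")
leaf_order = dendrogram(
    Z, no_plot=True, labels=[f"stim{i + 1}" for i in range(n_classes)]
)["ivl"]

print(f"Cross-validated accuracy: {accuracy:.3f} (chance = {1 / n_classes:.3f})")
print("Dendrogram leaf order:", leaf_order)
```

Swapping `X_freq` for `X_time`, or for magnitude-only or phase-only features, gives the kind of comparison the abstract reports, although accuracies on this synthetic data say nothing about the values in the paper.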

Publisher

Cold Spring Harbor Laboratory
