Pushing the envelope: Evaluating speech rhythm with different envelope extraction techniques

Published:2022-03 Issue:3 Volume:151 Page:2002-2026
ISSN:0001-4966
Container-title:The Journal of the Acoustical Society of America
language:en
Short-container-title:The Journal of the Acoustical Society of America

Author:

MacIntyre Alexis Deighton¹, Cai Ceci Qing¹, Scott Sophie K.¹

Affiliation:

1. Institute of Cognitive Neuroscience, University College London, London, WC1N 3AZ, United Kingdom

Abstract

The amplitude of the speech signal varies over time, and the speech envelope is an attempt to characterise this variation in the form of an acoustic feature. Although tacitly assumed, the similarity between the speech envelope-derived time series and that of phonetic objects (e.g., vowels) remains empirically unestablished. The current paper, therefore, evaluates several speech envelope extraction techniques, such as the Hilbert transform, by comparing different acoustic landmarks (e.g., peaks in the speech envelope) with manual phonetic annotation in a naturalistic and diverse dataset. Joint speech tasks are also introduced to determine which acoustic landmarks are most closely coordinated when voices are aligned. Finally, the acoustic landmarks are evaluated as predictors for the temporal characterisation of speaking style using classification tasks. The landmark that performed most closely to annotated vowel onsets was peaks in the first derivative of a human audition-informed envelope, consistent with converging evidence from neural and behavioural data. However, differences also emerged based on language and speaking style. Overall, the results show that both the choice of speech envelope extraction technique and the form of speech under study affect how sensitive an engineered feature is at capturing aspects of speech rhythm, such as the timing of vowels.
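As a rough illustration of the kind of landmark the abstract describes, the following minimal sketch extracts a Hilbert envelope from a synthetic signal and locates peaks in its first derivative. It is not the authors' pipeline: the abstract's best-performing landmark used a human audition-informed envelope, which is not reproduced here. The function names, the moving-average smoother (a stand-in for a proper low-pass filter), the synthetic two-burst signal, and the `min_height` threshold are all assumptions made for illustration.

```python
import numpy as np

def hilbert_envelope(signal):
    """Magnitude of the analytic signal, computed via the FFT."""
    n = len(signal)
    spectrum = np.fft.fft(signal)
    # Analytic signal: keep DC (and Nyquist for even n), double the
    # positive frequencies, zero the negative frequencies.
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.abs(np.fft.ifft(spectrum * weights))

def smooth(x, width):
    """Moving-average smoother (assumed stand-in for a low-pass filter)."""
    kernel = np.ones(width) / width
    return np.convolve(x, kernel, mode="same")

def derivative_peaks(envelope, sample_rate, min_height):
    """Times (s) of local maxima in the envelope's first derivative."""
    rate = np.gradient(envelope) * sample_rate  # amplitude change per second
    mid = rate[1:-1]
    is_peak = (mid > rate[:-2]) & (mid >= rate[2:]) & (mid > min_height)
    return (np.flatnonzero(is_peak) + 1) / sample_rate

# Synthetic stand-in for speech: a 200 Hz carrier with two amplitude
# bursts centred at 0.3 s and 0.7 s (hypothetical "vowels").
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
modulation = np.exp(-((t - 0.3) / 0.05) ** 2) + np.exp(-((t - 0.7) / 0.05) ** 2)
signal = modulation * np.sin(2 * np.pi * 200 * t)

raw_envelope = hilbert_envelope(signal)
envelope = smooth(raw_envelope, width=401)
onsets = derivative_peaks(envelope, sample_rate, min_height=2.0)
print(onsets)
```

In this toy case each derivative peak falls on the rising slope of a burst, ahead of the amplitude maximum, which is why such landmarks are candidates for approximating onsets rather than centres. Real speech would need a more careful filter and threshold choice.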

Publisher

Acoustical Society of America (ASA)

Subject

Acoustics and Ultrasonics, Arts and Humanities (miscellaneous)

Link

https://asa.scitation.org/doi/pdf/10.1121/10.0009844

References: 102 articles.

1. A PHONETICIAN’S VIEW OF VERSE STRUCTURE

2. Automatic measurement of vowel duration via structured prediction

3. Cortical entrainment: what we can learn from studying naturalistic speech perception

4. Rhythm, Timing and the Timing of Rhythm

5. The usefulness of metrics in the quantification of speech rhythm

Cited by 9 articles.

1. On the speech envelope in the cortical tracking of speech;NeuroImage;2024-08

2. Perception of temporal structure in speech is influenced by body movement and individual beat perception ability;Attention, Perception, & Psychophysics;2024-05-20

3. Testing an acoustic model of the P-center in English and Japanese;The Journal of the Acoustical Society of America;2024-04-01

4. Application of multi-algorithm mixed feature extraction model in underwater acoustic signal;Ocean Engineering;2024-03

5. Neural decoding of the speech envelope: Effects of intelligibility and spectral degradation;2024-02-23