Acoustic Correlates of the Syllabic Rhythm of Speech: Modulation Spectrum or Local Features of the Temporal Envelope

Author:

Zhang YuranORCID,Zou Jiajie,Ding Nai

Abstract

AbstractThe speech envelope is considered as a major acoustic correlate of the syllable rhythm since the peak frequency in the speech modulation spectrum matches the mean syllable rate. Nevertheless, it has not been quantified whether the peak modulation frequency can track the syllable rate of individual utterances and how much variance of the speech envelope can be explained by the syllable rhythm. Here, we address these problems by analyzing large speech corpora (>1000 hours of recording of multiple languages) using advanced sequence-to-sequence modeling. It is found that, only when averaged over minutes of speech recordings, the peak modulation frequency of speech reliably correlates with the syllable rate of a speaker. In contrast, the phase-locking between speech envelope and syllable onsets is robustly observed within a few seconds of recordings. Based on speaker-independent linear and nonlinear models, the timing of syllable onsets explains about 13% and 46% variance of the speech envelope, respectively. These results demonstrate that local temporal features in the speech envelope precisely encodes the syllable onsets but the modulation spectrum is not always dominated by the syllable rhythm.

Publisher

Cold Spring Harbor Laboratory

Reference63 articles.

1. The Effect of Speaking Rate onConsonant Vowel Coarticulation

2. Syllabic reduction in Mandarin and English speech;The Journal of the Acoustical Society of America,2014

3. Ardila, R. , Branson, M. , Davis, K. , Henretty, M. , Kohler, M. , Meyer, J. , … Weber, G. (2020). Common Voice: A Massively-Multilingual Speech Corpus. Proceedings of the 12th International Conference on Language Resources and Evaluation (Lrec 2020), 4218–4222.

4. The coupling between auditory and motor cortices is rate-restricted: Evidence for an intrinsic speech-motor rhythm

5. Talkers produce more pronounced amplitude modulations when speaking in noise

同舟云学术

1.学者识别学者识别

2.学术分析学术分析

3.人才评估人才评估

"同舟云学术"是以全球学者为主线,采集、加工和组织学术论文而形成的新型学术文献查询和分析系统,可以对全球学者进行文献检索和人才价值评估。用户可以通过关注某些学科领域的顶尖人物而持续追踪该领域的学科进展和研究前沿。经过近期的数据扩容,当前同舟云学术共收录了国内外主流学术期刊6万余种,收集的期刊论文及会议论文总量共计约1.5亿篇,并以每天添加12000余篇中外论文的速度递增。我们也可以为用户提供个性化、定制化的学者数据。欢迎来电咨询!咨询电话:010-8811{复制后删除}0370

www.globalauthorid.com

TOP

Copyright © 2019-2024 北京同舟云网络信息技术有限公司
京公网安备11010802033243号  京ICP备18003416号-3