Review of methods for coding of speech signals

Published:2023-02-07 Issue:1 Volume:2023
ISSN:1687-4722
Container-title:EURASIP Journal on Audio, Speech, and Music Processing
language:en
Short-container-title:J AUDIO SPEECH MUSIC PROC.

Author:

O’Shaughnessy Douglas

Abstract

Speech is the most common form of human communication, and many conversations use digital communication links. For efficient transmission, acoustic speech waveforms are usually converted to digital form, with reduced bit rates, while maintaining decoded speech quality. This paper reviews the history of speech coding techniques, from early mu-law logarithmic compression to recent neural-network methods. The techniques are examined in terms of output quality, algorithmic complexity, delay, and cost. Focus is on which aspects of speech can be exploited for high-quality transmission. The choices made to code speech are motivated by efficiency, the needs of applications, and access to information in the speech signal that is useful for both intelligibility and naturalness in the reconstructed speech at the decoder.

Funder

NSERC

Publisher

Springer Science and Business Media LLC

Subject

Electrical and Electronic Engineering, Acoustics and Ultrasonics

Link

https://link.springer.com/content/pdf/10.1186/s13636-023-00274-x.pdf

Reference: 139 articles.

1. M. Anthony, P.L. Bartlett, Neural network learning: theoretical foundations, vol 9 (Cambridge University Press, 1999)

2. J.G. Beerends, C. Schmidmer, J. Berger, M. Obermann, R. Ullmann, J. Pomy, M. Keyhl, Perceptual objective listening quality assessment (POLQA), the third generation ITU-T standard for end-to-end speech quality measurement part i—temporal alignment. J. Audio Engineer. Soc. 61(6), 366–384 (2013)

3. B. Bessette, R. Salami, R. Lefebvre, M. Jelinek, J. Rotola-Pukkila, J. Vainio, H. Mikkola, K. Jarvinen, The adaptive multi-rate wideband speech codec (AMR-WB). IEEE Trans. Speech Audio Process. 10(8), 620–636 (2002)

4. X. Bie, L. Girin, S. Leglaive, T. Hueber, X. Alameda-Pineda, A benchmark of dynamical variational autoencoders applied to speech spectrogram modeling. Interspeech, 46–50 (2021)

5. K. Brandenburg, in Audio Engineering Society Conference: 17th International Conference: High-Quality Audio Coding. MP3 and AAC explained (Audio Engineering Society, 1999)

Cited by 6 articles.

1. Noise robust speech encoding system in challenging acoustic conditions;International Journal of Speech Technology;2024-07-06

2. Method for testing the stability of an autoregressive model of the vocal tract and adjusting its parameters;Izmeritel'naya Tekhnika;2024-06-10

3. ASQ: An Ultra-Low Bit Rate ASR-Oriented Speech Quantization Method;IEEE Signal Processing Letters;2024

4. Multi-Agent Deep Learning for the Detection of Multiple Speech Steganography Methods;IEEE/ACM Transactions on Audio, Speech, and Language Processing;2024

5. Waveform based speech coding using nonlinear predictive techniques: a systematic review;International Journal of Speech Technology;2023-12