Unsupervised Phoneme Segmentation Based on Main Energy Change for Arabic Speech

Published:2017-03-30 Issue:2017 Volume:1 Page:12-20
ISSN:1509-4553
Container-title:Journal of Telecommunications and Information Technology
Short-container-title:JTIT

Author:

Lachachi Noureddine

Abstract

In this paper, a new method for segmenting speech at the phoneme level is presented. For this purpose, the author uses the short-time Fourier transform of the speech signal. The goal is to identify the locations of the main energy changes in frequency over time, which can be interpreted as phoneme boundaries. A frequency range analysis and a search for energy changes in individual areas are applied to gain further precision in identifying speech segments that carry vowel and consonant segments confined to a small number of narrow spectral areas. This method uses only the power spectrum of the signal for segmentation. There is no need for any adaptation of the parameters or training for different speakers in advance. In addition, no transcript information, prior linguistic knowledge about the phonemes, or voiced/unvoiced decision making is required. Segmentation results with the proposed method have been compared with a manual segmentation and with three segmentation methods of the same kind. These results show that 81% of the boundaries are successfully identified. This research aims to improve the acoustic parameters for all Arabic speech processing systems.
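
The general idea in the abstract (compute a short-time power spectrum, track energy per frequency range, and mark large frame-to-frame changes as candidate boundaries) can be illustrated with a minimal Python sketch. This is not the author's published algorithm. The frame length, hop size, number of bands, equal-width band layout, relative threshold, and minimum gap between boundaries are all assumptions chosen for illustration, and the function names are hypothetical.

```python
import numpy as np


def stft_power(signal, frame_len=400, hop=160):
    """Power spectrogram from a Hann-windowed short-time Fourier transform."""
    if len(signal) < frame_len:
        raise ValueError("signal is shorter than one frame")
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1)) ** 2  # (n_frames, n_bins)


def band_energy_change(power, n_bands=8, eps=1e-10):
    """Absolute frame-to-frame change in log band energy, summed over bands.

    Assumption: equal-width bands; the paper's frequency ranges may differ.
    """
    bands = np.array_split(power, n_bands, axis=1)
    log_energy = np.stack(
        [10.0 * np.log10(band.sum(axis=1) + eps) for band in bands], axis=1)
    return np.abs(np.diff(log_energy, axis=0)).sum(axis=1)  # (n_frames - 1,)


def candidate_boundaries(signal, sr, frame_len=400, hop=160, n_bands=8,
                         rel_threshold=0.5, min_gap=0.03):
    """Return times (seconds) of local maxima of the energy-change curve.

    Assumptions: a peak counts if it reaches rel_threshold * max(change),
    and peaks closer than min_gap seconds to the previous one are dropped.
    """
    power = stft_power(signal, frame_len, hop)
    change = band_energy_change(power, n_bands)
    threshold = rel_threshold * change.max()
    times = []
    for i in range(1, len(change) - 1):
        is_peak = change[i] >= change[i - 1] and change[i] > change[i + 1]
        if is_peak and change[i] >= threshold:
            # change[i] compares frame i with frame i + 1, so place the
            # boundary halfway between the centres of those two frames.
            t = ((i + 0.5) * hop + frame_len / 2) / sr
            if not times or t - times[-1] >= min_gap:
                times.append(t)
    return times
```

With a 16 kHz signal, the defaults above correspond to 25 ms frames and a 10 ms hop. Calling `candidate_boundaries(signal, 16000)` returns a list of boundary times in seconds. Like the method described in the abstract, the sketch needs no transcript or speaker training, but its threshold is a crude stand-in for whatever decision rule the paper actually uses.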

Publisher

National Institute of Telecommunications

References: 35 articles.

1. K. Vicsi and D. Sztahó, “Recognition of emotions on the basis of different levels of speech segments”, J. of Adv. Comput. Intell. and Intelligent Inform., vol. 16, no. 2, pp. 335–340, 2012.

2. K. Vicsi, D. Sztahó, and G. Kiss, “Examination of the sensitivity of acoustic-phonetic parameters of speech to depression”, in Proc. 3rd IEEE Int. Conf. on Cognitive Infocommun. CogInfoCom 2012, Kosice, Slovakia, 2012, pp. 511–515 (doi: 10.1109/CogInfoCom.2012.6422035).

3. K. Vicsi, V. Imre, and G. Kiss, “Improving the classification of healthy and pathological continuous speech”, in Proc. 15th Int. Conf. Text, Speech and Dialogue TSD 2012, Brno, Czech Republic, 2012, pp. 581–588.

4. J. P. Goldman, “EasyAlign: An automatic phonetic alignment tool under Praat”, in Proc. 12th Ann. Conf. of the Int. Speech Commun. Assoc. Interspeech 2011, Florence, Italy, 2011.

5. B. Bigi and D. Hirst, “Speech phonetization alignment and syllabication (SPPAS): A tool for the automatic analysis of speech prosody”, in Proc. 6th Int. Conf. Speech Prosody, Shanghai, China, 2012.

Cited by 1 article.

1. Unsupervised phoneme segmentation of continuous Arabic speech; International Journal of Speech Technology; 2024-05-02