Phase characteristics of vocal tract filter can distinguish speakers

Published: 2023-12-08 Volume: 9
ISSN: 2297-4687
Container-title: Frontiers in Applied Mathematics and Statistics
Short-container-title: Front. Appl. Math. Stat.

Author:

Okada Masahiro, Ito Hiroshi

Abstract

Introduction: Speaker recognition has been performed by considering individual variations in the power spectrograms of speech, which reflect the resonance phenomena in the speaker's vocal tract filter. In recent years, phase-based features have been used for speaker recognition. However, the phase-based features are not in a raw form of the phase but are crafted by humans, suggesting that the role of the raw phase is less interpretable. This study used phase spectrograms, which are calculated by subtracting the phase in the time-frequency domain of the electroglottograph signal from that of speech. The phase spectrograms represent the non-modified phase characteristics of the vocal tract filter.

Methods: The phase spectrograms were obtained from five Japanese participants. Phase spectrograms corresponding to vowels, called phase spectra, were then extracted and circular-averaged for each vowel. The speakers were determined based on the degree of similarity of the averaged spectra.

Results: The accuracy of discriminating speakers using the averaged phase spectra was observed to be high although speakers were discriminated using only phase information without power. In particular, the averaged phase spectra showed different shapes for different speakers, resulting in the similarity between the different speaker spectrum pairs being lower. Therefore, the speakers were distinguished by using phase spectra.

Discussion: This predominance of phase spectra suggested that the phase characteristics of the vocal tract filter reflect the individuality of speakers.
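
The processing outlined in the abstract (subtracting the electroglottograph phase from the speech phase, circular-averaging the resulting phase spectra, and comparing averaged spectra by similarity) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the inputs are assumed to be complex time-frequency coefficients that have already been computed and time-aligned, and the similarity measure (mean cosine of the phase difference) is an assumption, since the abstract does not state which measure was used.

```python
import numpy as np

def phase_difference(speech_tf, egg_tf):
    """Speech phase minus EGG phase per time-frequency bin, wrapped to (-pi, pi].

    Both inputs are complex arrays of the same shape (assumed precomputed).
    """
    return np.angle(speech_tf * np.conj(egg_tf))

def circular_mean(phases, axis=0):
    """Circular average: the angle of the mean unit phasor along `axis`."""
    return np.angle(np.mean(np.exp(1j * phases), axis=axis))

def phase_similarity(spec_a, spec_b):
    """Assumed similarity: mean cosine of the per-frequency phase difference.

    Returns 1.0 for identical spectra and -1.0 for spectra offset by pi.
    """
    return float(np.mean(np.cos(spec_a - spec_b)))

def identify(query, templates):
    """Hypothetical nearest-template rule: pick the most similar speaker."""
    return max(templates, key=lambda name: phase_similarity(query, templates[name]))

# Toy usage with synthetic phase spectra (frames x frequency bins).
rng = np.random.default_rng(0)
true_a = np.linspace(-1.0, 1.0, 8)   # made-up "speaker A" phase spectrum
true_b = np.linspace(1.0, -1.0, 8)   # made-up "speaker B" phase spectrum
frames_a = true_a + 0.05 * rng.standard_normal((20, 8))
frames_b = true_b + 0.05 * rng.standard_normal((20, 8))
templates = {"A": circular_mean(frames_a), "B": circular_mean(frames_b)}
query = true_a + 0.05 * rng.standard_normal(8)
best = identify(query, templates)
```

The circular mean is used instead of an arithmetic mean because phase is periodic: two phases just either side of ±π average to roughly ±π on the circle, whereas their arithmetic mean would be near 0.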

Funder

Japan Society for the Promotion of Science

Publisher

Frontiers Media SA

Subject

Applied Mathematics, Statistics and Probability
