Abstract
The so-called ‘mismatch’ is a factor that experts in forensic voice comparison encounter regularly. We therefore explored to what extent the ability of an automatic speaker recognition system and of earwitnesses to identify speakers is affected when recordings are acquired in different languages and at different times. One hundred voices from a database of 300 recordings (100 speakers, each recorded in three mutually mismatched sessions) were compared using the automatic speaker recognition software VOCALISE, based on i-vectors and x-vectors, and by 39 respondents in simulated voice parades. The automatic and perceptual approaches appear to have yielded similar results: the less complex the mismatch type, the more successful the identification. The results point to the superiority of the x-vector approach, and also to varying identification abilities among listeners.
Publisher
Charles University in Prague, Karolinum Press
Subject
General Engineering, Energy Engineering and Power Technology