The effect of speech pathology on automatic speaker verification: a large-scale study-Reference-Cited by-同舟云学术

The effect of speech pathology on automatic speaker verification: a large-scale study

Published:2023-11-22 Issue:1 Volume:13 Page:
ISSN:2045-2322
Container-title:Scientific Reports
language:en
Short-container-title:Sci Rep

Author:

Tayebi Arasteh Soroosh,Weise Tobias,Schuster Maria,Noeth Elmar,Maier Andreas,Yang Seung Hee

Abstract

AbstractNavigating the challenges of data-driven speech processing, one of the primary hurdles is accessing reliable pathological speech data. While public datasets appear to offer solutions, they come with inherent risks of potential unintended exposure of patient health information via re-identification attacks. Using a comprehensive real-world pathological speech corpus, with over n

$$=$$

= 3800 test subjects spanning various age groups and speech disorders, we employed a deep-learning-driven automatic speaker verification (ASV) approach. This resulted in a notable mean equal error rate (EER) of

$$0.89 \pm 0.06 \%$$

0.89 ± 0.06 % , outstripping traditional benchmarks. Our comprehensive assessments demonstrate that pathological speech overall faces heightened privacy breach risks compared to healthy speech. Specifically, adults with dysphonia are at heightened re-identification risks, whereas conditions like dysarthria yield results comparable to those of healthy speakers. Crucially, speech intelligibility does not influence the ASV system’s performance metrics. In pediatric cases, particularly those with cleft lip and palate, the recording environment plays a decisive role in re-identification. Merging data across pathological types led to a marked EER decrease, suggesting the potential benefits of pathological diversity in ASV, accompanied by a logarithmic boost in ASV effectiveness. In essence, this research sheds light on the dynamics between pathological speech and speaker verification, emphasizing its crucial role in safeguarding patient confidentiality in our increasingly digitized healthcare era.

Funder

Friedrich-Alexander-Universität Erlangen-Nürnberg

Medical Valley e.V.

Siemens Healthineers

Publisher

Springer Science and Business Media LLC

Subject

Multidisciplinary

Link

https://www.nature.com/articles/s41598-023-47711-7.pdf

Reference59 articles.

1. Rios-UrregoEmail, C., Vásquez-Correa, J., Orozco-Arroyave, J. & Nöth, E. Is there any additional information in a neural network trained for pathological speech classification? In Proc. 24th International Conference on Text, Speech, and Dialogue, Olomouc, Czech Republic, 435–447, https://doi.org/10.1007/978-3-030-83527-9_37 (Springer Nature, 2021).