Analyzing Wav2Vec 1.0 Embeddings for Cross-Database Parkinson’s Disease Detection and Speech Features Extraction

Published: 2024-08-26 Issue: 17 Volume: 24 Page: 5520
ISSN: 1424-8220
Container-title: Sensors
Language: en
Short-container-title: Sensors

Author:

Klempíř Ondřej¹, Krupička Radim¹

Affiliation:

1. Department of Biomedical Informatics, Faculty of Biomedical Engineering, Czech Technical University in Prague, 16000 Prague, Czech Republic

Abstract

Advancements in deep learning speech representations have facilitated the effective use of extensive unlabeled speech datasets for Parkinson’s disease (PD) modeling with minimal annotated data. This study employs the non-fine-tuned wav2vec 1.0 architecture to develop machine learning models for PD speech diagnosis tasks, such as cross-database classification and regression to predict demographic and articulation characteristics. The primary aim is to analyze overlapping components within the embeddings on both classification and regression tasks, investigating whether latent speech representations in PD are shared across models, particularly for related tasks. Firstly, evaluation using three multi-language PD datasets showed that wav2vec accurately detected PD based on speech, outperforming feature extraction using mel-frequency cepstral coefficients in the proposed cross-database classification scenarios. In cross-database scenarios using Italian and English-read texts, wav2vec demonstrated performance comparable to intra-dataset evaluations. We also compared our cross-database findings against those of other related studies. Secondly, wav2vec proved effective in regression, modeling various quantitative speech characteristics related to articulation and aging. Ultimately, subsequent analysis of important features examined the presence of significant overlaps between classification and regression models. The feature importance experiments discovered shared features across trained models, with increased sharing for related tasks, further suggesting that wav2vec contributes to improved generalizability. The study proposes wav2vec embeddings as a next promising step toward a speech-based universal model to assist in the evaluation of PD.
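The abstract's feature-sharing analysis can be illustrated with a minimal sketch. It assumes that each trained model yields one importance score per embedding dimension and that sharing is measured as the Jaccard overlap of the top-k dimensions. The abstract does not state which overlap measure the authors used, so the measure, the function names, and the toy scores below are all hypothetical.

```python
def top_k_features(importances, k):
    """Return the set of indices of the k largest importance scores."""
    ranked = sorted(range(len(importances)),
                    key=lambda i: importances[i], reverse=True)
    return set(ranked[:k])


def jaccard_overlap(a, b):
    """Jaccard index |a & b| / |a | b| of two index sets (0.0 if both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical importance scores over a toy 6-dimensional embedding;
# real wav2vec embeddings have many more dimensions.
clf_importances = [0.30, 0.05, 0.25, 0.10, 0.20, 0.10]  # e.g. a PD classifier
reg_importances = [0.28, 0.02, 0.10, 0.30, 0.22, 0.08]  # e.g. an age regressor

clf_top = top_k_features(clf_importances, 3)
reg_top = top_k_features(reg_importances, 3)
overlap = jaccard_overlap(clf_top, reg_top)
print(sorted(clf_top), sorted(reg_top), overlap)
```

Under this measure, a higher overlap for related tasks than for unrelated ones would correspond to the "increased sharing" the abstract reports.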

Funder

European Union – Next Generation EU

Publisher

MDPI AG

Link

https://www.mdpi.com/1424-8220/24/17/5520/pdf
