Alzheimer’s Dementia Speech (Audio vs. Text): Multi-Modal Machine Learning at High vs. Low Resolution

Author:

Priyadarshinee Prachee (1), Clarke Christopher Johann (1), Melechovsky Jan (1), Lin Cindy Ming Ying (1), B. T. Balamurali (1), Chen Jer-Ming (1)

Affiliation:

1. Science, Mathematics and Technology, Singapore University of Technology and Design, Singapore 487372, Singapore

Abstract

Automated techniques to detect Alzheimer’s Dementia from audio recordings of spontaneous speech are now available with varying degrees of reliability. Here, we present a systematic comparison across different modalities, granularities and machine learning models to guide the choice of the most effective tools. Specifically, we present a multi-modal approach (audio and text) for the automatic detection of Alzheimer’s Dementia from recordings of spontaneous speech. Sixteen features were tested to determine their relative performance, including four feature extraction methods not previously applied in this context (Energy–Time plots, Keg of Text Analytics, Keg of Text Analytics-Extended and Speech to Silence ratio). These features encompass two modalities (audio vs. text) at two resolution scales (frame-level vs. file-level). We compared the accuracy resulting from these features and found that text-based classification outperformed audio-based classification, with the best performance attaining 88.7%, surpassing other reports to date relying on the same dataset. For text-based classification in particular, the best file-level feature performed 9.8% better than the frame-level feature. However, for audio-based classification, the best frame-level feature performed 1.4% better than the best file-level feature. This multi-modal, multi-model comparison at high and low resolution offers insights into which approach is most efficacious, depending on the sampling context. Such a comparison of the accuracy of Alzheimer’s Dementia classification using both frame-level and file-level granularities on audio and text modalities of different machine learning models on the same dataset has not been previously addressed. We also demonstrate that the subject’s speech captured in short time frames and their dynamics may contain enough inherent information to indicate the presence of dementia. Overall, such a systematic analysis facilitates the identification of Alzheimer’s Dementia quickly and non-invasively, potentially leading to more timely interventions and improved patient outcomes.
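
To make the distinction between the two resolution scales concrete, the following is a minimal sketch and not the authors' implementation. It assumes that a frame-level feature is one value per short window of audio (here, mean squared amplitude) and that a file-level feature is a single summary per recording (here, a speech-to-silence ratio obtained by thresholding the frame energies). The function names, frame length, threshold and synthetic signal are all hypothetical choices made for illustration.

```python
import math

def frame_energies(samples, frame_len, hop_len):
    """Frame-level (high-resolution) feature: mean squared amplitude per frame."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, hop_len):
        frame = samples[start:start + frame_len]
        energies.append(sum(x * x for x in frame) / frame_len)
    return energies

def speech_to_silence_ratio(energies, threshold):
    """File-level (low-resolution) feature: count of speech frames over silent frames.

    Assumption: a frame counts as speech when its energy reaches the threshold.
    """
    speech = sum(1 for e in energies if e >= threshold)
    silence = len(energies) - speech
    return speech / silence if silence else float("inf")

# Synthetic stand-in for a recording: 1 s of a 220 Hz tone, then 1 s of silence.
sr = 8000  # hypothetical sampling rate in Hz
signal = [0.5 * math.sin(2 * math.pi * 220 * n / sr) for n in range(sr)] + [0.0] * sr

energies = frame_energies(signal, frame_len=400, hop_len=400)  # one value per 50 ms frame
ratio = speech_to_silence_ratio(energies, threshold=1e-4)      # one value per file
```

In this sketch the frame-level representation keeps the temporal dynamics (a sequence of 40 energies), while the file-level representation collapses the whole recording into one number, which mirrors the high- versus low-resolution contrast described in the abstract.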

Funder

SUTD Growth Plan

Publisher

MDPI AG

Subject

Fluid Flow and Transfer Processes, Computer Science Applications, Process Chemistry and Technology, General Engineering, Instrumentation, General Materials Science


Cited by 5 articles.

1. Understanding Dementia Speech: Towards an Adaptive Voice Assistant for Enhanced Communication;Companion of the 16th ACM SIGCHI Symposium on Engineering Interactive Computing Systems;2024-06-24

2. Performance Assessment of ChatGPT versus Bard in Detecting Alzheimer’s Dementia;Diagnostics;2024-04-15

3. A study on Multimodal approach for early detection of Dementia using Deep Learning;2024 IEEE International Conference for Women in Innovation, Technology & Entrepreneurship (ICWITE);2024-02-16

4. Transferring Speech-Generic and Depression-Specific Knowledge for Alzheimer’s Disease Detection;2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU);2023-12-16

5. Dementia Speech Dataset Creation and Analysis in Indic Languages—A Pilot Study;IEEE Access;2023
