Advancing Paraphasia Detection with End-to-End Learning: A Comparative Approach Study (Preprint)

Author:

Matthew Perez, Duc Le, Amrit Romana, Elise Jones, Keli Licata, Emily Mower Provost

Abstract

BACKGROUND

Paraphasias are speech errors that are often characteristic of aphasia, and they represent an important signal in assessing disease severity and subtype. Traditionally, clinicians identify paraphasias manually by transcribing and analyzing speech-language samples, which can be time-consuming and burdensome. Automatic paraphasia detection can greatly help clinicians with the transcription process and ultimately facilitate more efficient and consistent aphasia assessment.

OBJECTIVE

This study investigates a novel machine learning framework for automatic paraphasia detection that is trained end-to-end (i.e., a unified network that takes speech audio as input and outputs text that indicates what was said and identifies which words are paraphasias). We use the AphasiaBank corpus, which contains audio data collected from persons with aphasia (PWAs) that has been transcribed and labeled with paraphasias by trained speech-language pathologists.
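As a minimal illustration of what such a joint output could look like, the sketch below assumes a hypothetical format in which each recognized word is followed by a bracketed tag (`[c]` for a correct word, `[p]` for a phonemic paraphasia, `[n]` for a neologistic paraphasia). The tag inventory, the example utterance, and the function name `parse_tagged_transcript` are assumptions made for illustration, not the output format used in the study.

```python
import re
from typing import List, Tuple

# Hypothetical tag inventory (an assumption, not taken from the study):
# c = correct word, p = phonemic paraphasia, n = neologistic paraphasia.
TAG_PATTERN = re.compile(r"(\S+)\s+\[(c|p|n)\]")

def parse_tagged_transcript(text: str) -> List[Tuple[str, str]]:
    """Split a tagged transcript into (word, tag) pairs."""
    return [(m.group(1), m.group(2)) for m in TAG_PATTERN.finditer(text)]

# Made-up example utterance in the assumed format.
pairs = parse_tagged_transcript("the [c] tat [p] sat [c] on [c] the [c] blicket [n]")

# Words flagged as paraphasias of either type.
paraphasias = [word for word, tag in pairs if tag != "c"]
```

A single output sequence of this kind carries both the transcript (the words) and the paraphasia decisions (the tags), which is what allows one model to be trained on both tasks at once.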

METHODS

We propose a novel sequence-to-sequence (seq2seq) architecture for performing both automatic speech recognition (ASR) and paraphasia detection tasks. We explore the impact of leveraging pretrained speech models as well as different learning objectives for optimizing this model. This approach can be advantageous in learning synergistic representations that benefit both ASR and paraphasia detection tasks. We compare against a previous state-of-the-art method that uses a multi-step pipeline approach consisting of ASR, hand-engineered feature extraction, and paraphasia detection.

RESULTS

We show that the proposed seq2seq model outperforms the multi-step pipeline approach for word-level and utterance-level paraphasia detection. We achieve word-level performance improvements of 16.9%, 36.4%, and 9.5% and utterance-level improvements of 5.2%, 13.9%, and 18.9% for phonemic, neologistic, and phonemic+neologistic paraphasias, respectively.
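The relationship between the two evaluation granularities can be sketched as follows. The sketch assumes (this is not stated in the abstract) that an utterance counts as positive for a paraphasia type when at least one of its words carries that label. The tag names (`c` for correct, `p` for phonemic, `n` for neologistic) and the helper `utterance_label` are hypothetical.

```python
from typing import List

def utterance_label(word_tags: List[str], target: str) -> bool:
    """Return True if any word in the utterance carries the target tag.

    Assumption for illustration: the utterance-level decision is the
    logical OR of the word-level decisions for that paraphasia type.
    """
    return any(tag == target for tag in word_tags)

# Made-up word-level tags for one utterance.
word_tags = ["c", "p", "c", "c", "c", "n"]

has_phonemic = utterance_label(word_tags, "p")
has_neologistic = utterance_label(word_tags, "n")
# The combined phonemic+neologistic category fires if either type is present.
has_either = has_phonemic or has_neologistic
```

Under this assumption, word-level evaluation scores each tag individually, while utterance-level evaluation scores only whether a given utterance contains a paraphasia of the type in question.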

CONCLUSIONS

These results highlight the performance improvements of learning to detect paraphasias end-to-end rather than through a multi-step pipeline approach with separate ASR and paraphasia detection models. The advantage of learning both tasks end-to-end is that the unified model can learn joint representations that benefit both ASR and paraphasia detection, rather than optimizing each task separately. Future work will explore the efficacy of a deployed paraphasia detection model at assisting medical professionals with annotation.

Publisher

JMIR Publications Inc.
