Comparison of performance of automatic recognizers for stutters in speech trained with event or interval markers

Published: 2024-02-27
Volume: 15
ISSN: 1664-1078
Container-title: Frontiers in Psychology
Short-container-title: Front. Psychol.

Author:

Barrett Liam, Tang Kevin, Howell Peter

Abstract

Introduction: Automatic recognition of stutters (ARS) from speech recordings can facilitate objective assessment and intervention for people who stutter. However, the performance of ARS systems may depend on how the speech data are segmented and labelled for training and testing. This study compared two segmentation methods: event-based, which delimits speech segments by their fluency status, and interval-based, which uses fixed-length segments regardless of fluency.

Methods: Machine learning models were trained and evaluated on interval-based and event-based stuttered speech corpora. The models used acoustic and linguistic features extracted from the speech signal and the transcriptions generated by a state-of-the-art automatic speech recognition system.

Results: The results showed that event-based segmentation led to better ARS performance than interval-based segmentation, as measured by the area under the curve (AUC) of the receiver operating characteristic. The results suggest differences in the quality and quantity of the data because of segmentation method. The inclusion of linguistic features improved the detection of whole-word repetitions, but not other types of stutters.

Discussion: The findings suggest that event-based segmentation is more suitable for ARS than interval-based segmentation, as it preserves the exact boundaries and types of stutters. The linguistic features provide useful information for separating supra-lexical disfluencies from fluent speech but may not capture the acoustic characteristics of stutters. Future work should explore more robust and diverse features, as well as larger and more representative datasets, for developing effective ARS systems.
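The difference between the two segmentation schemes described in the abstract can be illustrated with a small sketch. It is not the authors' pipeline: the `Segment` type, the `to_intervals` helper, the 3-second interval length, the label names, and the rule that an interval counts as stuttered when it overlaps any stutter event are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    start: float  # seconds
    end: float    # seconds
    label: str    # "fluent" or a (hypothetical) stutter type name


def to_intervals(events, total_duration, interval_length=3.0):
    """Re-label event-based annotations as fixed-length intervals.

    Assumed rule: an interval takes the label(s) of every non-fluent
    event it overlaps, and is "fluent" only if it overlaps none.
    """
    intervals = []
    n = 0
    while n * interval_length < total_duration:
        start = n * interval_length
        end = min(start + interval_length, total_duration)
        overlapping = sorted({
            e.label for e in events
            if e.label != "fluent" and e.start < end and e.end > start
        })
        label = "+".join(overlapping) if overlapping else "fluent"
        intervals.append(Segment(start, end, label))
        n += 1
    return intervals


# Event-based annotation: boundaries follow the fluency status of the speech.
events = [
    Segment(0.0, 2.5, "fluent"),
    Segment(2.5, 3.4, "prolongation"),
    Segment(3.4, 7.0, "fluent"),
    Segment(7.0, 7.6, "repetition"),
    Segment(7.6, 9.0, "fluent"),
]

# Interval-based annotation: fixed 3 s windows regardless of fluency.
intervals = to_intervals(events, total_duration=9.0)
for seg in intervals:
    print(f"{seg.start:.1f}-{seg.end:.1f}s: {seg.label}")
```

In this toy example the event-based annotation has two stutter segments and three fluent ones, while the interval-based version marks all three windows as stuttered, because the prolongation straddles the 3-second boundary. This is one way the two schemes can yield training data that differ in both quantity and label quality, although the actual corpora and labelling rules used in the study may differ.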

Publisher

Frontiers Media SA

Reference: 36 articles.

1. Systematic review of machine learning approaches for detecting developmental stuttering; Barrett; IEEE/ACM Trans Audio Speech Lang Process, 2022

2. The Influence of Dataset Partitioning on Dysfluency Detection Systems