Speech Recognition for Task Domains with Sparse Matched Training Data

Published: 2020-09-04 Issue: 18 Volume: 10 Page: 6155
ISSN: 2076-3417
Container-title: Applied Sciences
Language: en
Short-container-title: Applied Sciences

Author:

Kang Byung Ok, Jeon Hyeong Bae, Park Jeon Gue

Abstract

We propose two approaches to speech recognition for task domains with sparse matched training data. The first is an active learning method that selects training data for the target domain from a general domain that already has a significant amount of labeled speech data. This method uses attribute-disentangled latent variables. For the active learning process, we designed an integrated system consisting of a variational autoencoder, whose encoder infers latent variables with disentangled attributes from the input speech, and a classifier that selects training data with attributes matching the target domain. The second approach combines data augmentation methods for generating matched target-domain speech data with transfer learning methods based on teacher/student learning. To evaluate the proposed method, we experimented with various task domains with sparse matched training data. The experimental results show that the proposed method has qualitative characteristics suited to the desired purpose, outperforms random selection, and is comparable to using an equal amount of additional target-domain data.
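
The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the latent vectors have already been inferred by some encoder (the variational autoencoder itself is not shown), it substitutes a plain logistic-regression classifier for whatever classifier the paper uses, and the function names `train_domain_classifier`, `target_probability`, and `select_matching` are hypothetical.

```python
import math


def target_probability(z, weights, bias):
    """Probability that latent vector z looks like target-domain speech."""
    score = sum(w * x for w, x in zip(weights, z)) + bias
    # Numerically stable sigmoid.
    if score >= 0:
        return 1.0 / (1.0 + math.exp(-score))
    e = math.exp(score)
    return e / (1.0 + e)


def train_domain_classifier(latents, labels, epochs=200, lr=0.5):
    """Fit logistic regression by batch gradient descent.

    labels: 1 for target-domain utterances, 0 for general-domain ones
    (an assumed labeling scheme, not taken from the paper).
    Returns (weights, bias).
    """
    dim = len(latents[0])
    n = len(latents)
    weights = [0.0] * dim
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * dim
        grad_b = 0.0
        for z, y in zip(latents, labels):
            err = target_probability(z, weights, bias) - y
            for i in range(dim):
                grad_w[i] += err * z[i]
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def select_matching(pool_latents, weights, bias, k):
    """Indices of the k pool utterances scored as most target-like."""
    ranked = sorted(
        range(len(pool_latents)),
        key=lambda i: target_probability(pool_latents[i], weights, bias),
        reverse=True,
    )
    return ranked[:k]


# Toy 2-D latents (made-up numbers, for illustration only).
target_seed = [[2.0, 2.1], [1.9, 2.0], [2.1, 1.8]]
general_seed = [[-2.0, -1.9], [-1.8, -2.1], [-2.1, -2.0]]
weights, bias = train_domain_classifier(
    target_seed + general_seed, [1, 1, 1, 0, 0, 0]
)

# Labeled general-domain pool from which training data is selected.
pool = [[-2.0, -2.0], [1.8, 2.0], [0.1, -3.0], [2.2, 1.9]]
selected = select_matching(pool, weights, bias, k=2)
print(selected)
```

In this sketch, the small amount of target-domain data only trains the selector; the selected utterances come from the general-domain pool and keep their existing labels, which is the property the abstract highlights.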

Publisher

MDPI AG

Subject

Fluid Flow and Transfer Processes, Computer Science Applications, Process Chemistry and Technology, General Engineering, Instrumentation, General Materials Science

Link

https://www.mdpi.com/2076-3417/10/18/6155/pdf

Cited by 3 articles.

1. AI‐based language tutoring systems with end‐to‐end automatic speech recognition and proficiency evaluation; ETRI Journal; 2024-01-31

2. Influence of Highly Inflected Word Forms and Acoustic Background on the Robustness of Automatic Speech Recognition for Human–Computer Interaction; Mathematics; 2022-02-24

3. Multimodal Unsupervised Speech Translation for Recognizing and Evaluating Second Language Speech; Applied Sciences; 2021-03-16