Speech retrieval from unsegmented Finnish audio using statistical morpheme-like units for segmentation, recognition, and retrieval

Published: 2011-10 Issue: 1 Volume: 8 Page: 1-25
ISSN: 1550-4875
Container-title: ACM Transactions on Speech and Language Processing
Language: en
Short-container-title: ACM Trans. Speech Lang. Process.

Author:

Turunen Ville T.¹, Kurimo Mikko¹

Affiliation:

1. Aalto University School of Science, Aalto, Finland

Abstract

This article examines the use of statistically discovered morpheme-like units for Spoken Document Retrieval (SDR). The morpheme-like units (morphs) are used both for language modeling in speech recognition and as index terms. Traditional word-based methods suffer from out-of-vocabulary words. If a word is not in the recognizer vocabulary, any occurrence of the word in speech will be missing from the transcripts. The problem is especially severe for languages with a high number of distinct word forms, such as Finnish. With the morph language model, even previously unseen words can be recognized by identifying their component morphs. Similarly, in information retrieval queries, complex word forms, even unseen ones, can be matched to data after segmenting them into morphs. Retrieval performance can be further improved by expanding the transcripts with alternative recognition results from confusion networks. In this article, a novel retrieval evaluation corpus consisting of unsegmented Finnish radio programs, 25 queries, and corresponding human relevance assessments was constructed. Previous results on using morphs and confusion networks for Finnish SDR are confirmed and extended to the unsegmented case. As previously, using morphs or base forms as index terms yields about equal performance, but combination methods, including a new one, are found to work better than either alone. Using alternative morph segmentations of the query words is found to further improve the results. Lexical similarity-based story segmentation was applied, and performance using morphs, base forms, and their combinations was compared for the first time.

Publisher

Association for Computing Machinery (ACM)

Subject

Computational Mathematics,Computer Science (miscellaneous)

Link

https://dl.acm.org/doi/pdf/10.1145/2036916.2036917

Reference: 63 articles.

1. Arisoy, E., Kurimo, M., Saraçlar, M., Hirsimäki, T., Pylkkönen, J., Alumäe, T., and Sak, H. 2008. Statistical language modeling for automatic speech recognition of agglutinative languages. In Speech Recognition Technologies and Applications, F. Mihelic and J. Zibert, Eds. I-Tech, 193--204.

2. Topic segmentation with an aspect hidden Markov model

Cited by 3 articles.

1. End-to-End Open Vocabulary Keyword Search With Multilingual Neural Representations;IEEE/ACM Transactions on Audio, Speech, and Language Processing;2023

2. A Comparative Study of Minimally Supervised Morphological Segmentation;Computational Linguistics;2016-03

3. Results for Variable Speaker and Recording Conditions on Spoken IR in Finnish;Speech and Computer;2013