Abstract
Research on speech technologies requires spoken data, which is usually obtained from recordings of read speech and adapted specifically to the research needs. When the aim is to deal with the prosody involved in speech, the available data must reflect natural and conversational speech, which is usually costly and difficult to obtain. This paper presents a machine learning-oriented toolkit for collecting, handling, and visualizing speech data using prosodic heuristics. We present two corpora resulting from these methodologies: the PANTED corpus, containing 250 h of English speech from TED Talks, and the Heroes corpus, containing 8 h of parallel English and Spanish movie speech. We demonstrate their use in two deep learning-based applications: punctuation restoration and machine translation. The presented corpora are freely available to the research community.
Funder
Spanish Ministry of Economy, Industry and Competitiveness, through the Ramón y Cajal
Universitat Pompeu Fabra
Publisher
Springer Science and Business Media LLC
Subject
Library and Information Sciences, Linguistics and Language, Education, Language and Linguistics
Reference
58 articles.
1. Adami, A. G., Mihaescu, R., Reynolds, D. A., & Godfrey, J. J. (2003). Modeling prosodic dynamics for speaker recognition. In 2003 IEEE international conference on acoustics, speech, and signal processing, 2003. Proceedings (ICASSP’03) (Vol. 4, pp. IV-788). IEEE.
2. Almeman, K., Lee, M., & Almiman, A.A. (2013). Multi dialect Arabic speech parallel corpora. In 1st International conference on communications, signal processing, and their applications (ICCSPA) (pp. 1–6). IEEE.
3. Avanzi, M., Lacheret-Dujour, A., & Victorri, B. (2008). ANALOR. A tool for semi-automatic annotation of French prosodic structure.
4. Bahdanau, D., Cho, K., & Bengio, Y. (2014). Neural machine translation by jointly learning to align and translate. CoRR.
5. Batista, F., Moniz, H., Trancoso, I., & Mamede, N. (2012). Bilingual experiments on automatic recovery of capitalization and punctuation of automatic speech transcripts. IEEE Transactions on Audio, Speech, and Language Processing, 20(2), 474–485.
Cited by
1 article.
1. Abstract Extraction Algorithm on Account of Parallel Corpus in English Teaching;2022 International Conference on Knowledge Engineering and Communication Systems (ICKES);2022-12-28