Author:
EBDEN PETER, SPROAT RICHARD
Abstract
This paper describes the Kestrel text normalization system, a component of the Google text-to-speech synthesis (TTS) system. At the core of Kestrel are text-normalization grammars that are compiled into libraries of weighted finite-state transducers (WFSTs). While the use of WFSTs for text normalization is itself not new, Kestrel differs from previous systems in its separation of the initial tokenization and classification phase of analysis from verbalization. Input text is first tokenized and the tokens are classified using WFSTs. As part of the classification, detected semiotic classes (expressions such as currency amounts, dates, times, and measure phrases) are parsed into protocol buffers (https://code.google.com/p/protobuf/). The protocol buffers are then verbalized, with possible reordering of the elements, again using WFSTs. This paper describes the architecture of Kestrel and the protocol buffer representations of semiotic classes, and presents some examples of grammars for various languages. We also discuss applications and deployments of Kestrel as part of the Google TTS system, which runs on both server and client side on multiple devices, and is used daily by millions of people in nineteen languages and counting.
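The two-phase design described in the abstract can be illustrated with a minimal Python sketch. This is not Kestrel's code: a regular expression stands in for the WFST classifier, a dataclass stands in for the protocol buffer, and every name here (`Money`, `PlainToken`, `tokenize_and_classify`, `verbalize`, `normalize`) is hypothetical. It only shows the idea of parsing a semiotic class into a structured record first and verbalizing that record, with reordering, in a separate step.

```python
import re
from dataclasses import dataclass


# Hypothetical stand-in for a semiotic-class record.
# Kestrel uses protocol buffers; a dataclass is used here for illustration only.
@dataclass
class Money:
    currency: str
    amount: int


@dataclass
class PlainToken:
    text: str


# Phase 1: tokenization and classification.
# A regex stands in for the WFST-based classifier (an assumption of this sketch).
MONEY_RE = re.compile(r"^\$(\d+)$")


def tokenize_and_classify(text):
    tokens = []
    for raw in text.split():
        match = MONEY_RE.match(raw)
        if match:
            tokens.append(Money(currency="usd", amount=int(match.group(1))))
        else:
            tokens.append(PlainToken(raw))
    return tokens


ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty",
        "sixty", "seventy", "eighty", "ninety"]
CURRENCY_NAMES = {"usd": ("dollar", "dollars")}


def number_name(n):
    # Toy number-name function covering 0..99 only.
    if 0 <= n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] + ("" if ones == 0 else " " + ONES[ones])
    raise ValueError("this sketch only covers 0..99")


# Phase 2: verbalization of the structured record.
def verbalize(token):
    if isinstance(token, Money):
        singular, plural = CURRENCY_NAMES[token.currency]
        unit = singular if token.amount == 1 else plural
        # Reordering: the currency symbol is written first but spoken last.
        return f"{number_name(token.amount)} {unit}"
    return token.text


def normalize(text):
    return " ".join(verbalize(t) for t in tokenize_and_classify(text))
```

For example, `normalize("It costs $25 today")` classifies `$25` as a `Money` record and verbalizes it as "twenty five dollars", moving the currency after the amount. Because the verbalizer sees only the record, the same classification output could in principle feed a different verbalizer for another language, which is the benefit of the separation the abstract emphasizes.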
Publisher
Cambridge University Press (CUP)
Subject
Artificial Intelligence, Linguistics and Language, Language and Linguistics, Software
Reference
36 articles.
1. Text-to-Speech Synthesis
2. Tai T., Skut W., and Sproat R. 2011. Thrax: an open source grammar compiler built on OpenFst. In Automatic Speech Recognition and Understanding Workshop, Waikoloa Resort, Hawaii.
3. Lightly supervised learning of text normalization: Russian number names
4. A Generic Finite State Compiler for Tagging Rules
Cited by
38 articles.
1. A Chat about Boring Problems: Studying GPT-Based Text Normalization;ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP);2024-04-14
2. A Unified Front-End Framework for English Text-to-Speech Synthesis;ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP);2024-04-14
3. TNFormer: Single-Pass Multilingual Text Normalization with a Transformer Decoder Model;ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP);2024-04-14
4. Bangla text normalization for text-to-speech synthesizer using machine learning algorithms;Journal of King Saud University - Computer and Information Sciences;2024-01
5. Adopting Neural Translation Model in Data Generation for Inverse Text Normalization;2023 Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC);2023-10-31