MAKEDONKA: Applied Deep Learning Model for Text-to-Speech Synthesis in Macedonian Language

Published:2020-10-01 Issue:19 Volume:10 Page:6882
ISSN:2076-3417
Container-title:Applied Sciences
language:en
Short-container-title:Applied Sciences

Author:

Mishev Kostadin, Karovska Ristovska Aleksandra, Trajanov Dimitar, Eftimov Tome, Simjanoska Monika

Abstract

This paper presents MAKEDONKA, the first open-source Macedonian language synthesizer based on the Deep Learning approach. The paper provides an overview of the numerous attempts to achieve human-like, reproducible speech, which have unfortunately proven unsuccessful because the work had low visibility and lacked examples of integration with real software tools. Recent advances in Machine Learning, in particular Deep Learning-based methodologies, provide novel methods for feature engineering that allow for smooth transitions in the synthesized speech, making it sound natural and human-like. This paper presents a methodology for end-to-end speech synthesis based on Deep Voice 3, a fully convolutional sequence-to-sequence acoustic model with a position-augmented attention mechanism. Our model synthesizes Macedonian speech directly from characters. We created a dataset that contains approximately 20 hours of speech from a native Macedonian female speaker and used it to train the text-to-speech (TTS) model. The achieved mean opinion score (MOS) of 3.93 makes our model appropriate for any kind of software that needs a text-to-speech service in the Macedonian language. Our TTS platform is publicly available for use and ready for integration.

Publisher

MDPI AG

Subject

Fluid Flow and Transfer Processes, Computer Science Applications, Process Chemistry and Technology, General Engineering, Instrumentation, General Materials Science

Link

https://www.mdpi.com/2076-3417/10/19/6882/pdf

References: 55 articles.

1. A Review of Deep Learning Based Speech Synthesis

2. Robust Speaker-Adaptive HMM-Based Text-to-Speech Synthesis

Cited by 3 articles.

1. Exploring ASR Models in Low-Resource Languages: Use-Case the Macedonian Language;Communications in Computer and Information Science;2023

2. Macedonian Speech Synthesis for Assistive Technology Applications;2022 30th European Signal Processing Conference (EUSIPCO);2022-08-29

3. Assistive e-Learning Software Modules to Aid Education Process of Students with Visual and Hearing Impairment: A Case Study in North Macedonia;Communications in Computer and Information Science;2022