Authors:
Jia Ning, Zheng Chunjun, Sun Wei
Abstract
The generation of emotional speech is a challenging research topic with wide applications in speech processing. Because the design of effective speech feature representations and generation models directly affects the accuracy of emotional speech generation, a general solution for emotional speech synthesis is difficult to find. This paper takes the CycleGAN model as its starting point and uses an improved convolutional neural network (CNN) model together with an identity-mapping loss to capture temporal information effectively. At the same time, the forward and inverse mappings are learned jointly to find the best-matching design, preserving the linguistic content of the speech in the process without relying on additional audio data. Experiments on a corpus of children's read speech show that, by comparing speech emotion before and after the improvement, emotional speech can be recognized accurately. Comparison with common emotional speech generation models verifies the advantages of the proposed model.
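The abstract's combination of cycle consistency and an identity-mapping loss can be sketched as follows. This is a minimal illustrative example, not the paper's implementation: `G` and `F` stand in for the paper's improved CNN generators as toy linear maps, and the loss weights `lam_cyc` and `lam_id` are assumed values, not taken from the paper.

```python
import numpy as np

def l1_loss(a, b):
    """Mean absolute error, the L1 penalty typically used for these terms."""
    return np.mean(np.abs(a - b))

# Toy stand-in generators: G maps neutral features to emotional features,
# F maps back. In the paper these would be CNN models.
def G(x):
    return 1.5 * x + 0.2

def F(y):
    return (y - 0.2) / 1.5

def cyclegan_aux_losses(x, y, lam_cyc=10.0, lam_id=5.0):
    """Auxiliary losses trained alongside the adversarial losses."""
    # Cycle consistency in both directions: F(G(x)) should recover x,
    # and G(F(y)) should recover y, so linguistic content is preserved.
    cyc = l1_loss(F(G(x)), x) + l1_loss(G(F(y)), y)
    # Identity mapping: feeding a generator an input already in its
    # target domain should change it little.
    ident = l1_loss(G(y), y) + l1_loss(F(x), x)
    return lam_cyc * cyc, lam_id * ident

x = np.linspace(-1.0, 1.0, 8)   # stand-in neutral acoustic features
y = G(x)                        # stand-in emotional acoustic features
cyc, ident = cyclegan_aux_losses(x, y)
print(cyc, ident)
```

Because the toy `F` exactly inverts `G`, the cycle-consistency term is near zero here, while the identity term is nonzero and penalizes the generators for altering inputs already in their target domain.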
Subject
General Physics and Astronomy
Cited by: 4 articles.