Grammar-Supervised End-to-End Speech Recognition with Part-of-Speech Tagging and Dependency Parsing

Published: 2023-03-27 Issue: 7 Volume: 13 Page: 4243
ISSN: 2076-3417
Container-title: Applied Sciences
Language: en
Short-container-title: Applied Sciences

Author:

Wan Genshun¹,², Mao Tingzhi², Zhang Jingxuan², Chen Hang¹, Gao Jianqing², Ye Zhongfu¹

Affiliation:

1. National Engineering Research Center of Speech and Language Information Processing, University of Science and Technology of China, Hefei 230088, China

2. iFLYTEK Research, iFLYTEK Co., Ltd., Hefei 230088, China

Abstract

In most automatic speech recognition systems, many unacceptable hypothesis errors still make the recognition results absurd and difficult to understand. In this paper, we introduce grammar information to reduce the grammatical deviation distance and increase the readability of the hypothesis. Word embeddings are reinforced with grammar embeddings to strengthen the grammar expression. An auxiliary text-to-grammar task is provided to improve the recognition results under downstream task evaluation. Furthermore, multiple grammar evaluation methods are used to explore an expandable usage paradigm for grammar knowledge. Experiments on the small open-source Mandarin speech corpus AISHELL-1 and the large private Mandarin speech corpus TRANS-M show that our method performs well with no additional data. Our method achieves relative character error rate reductions of 3.2% and 5.0% and relative grammatical deviation distance reductions of 4.7% and 5.9% on the AISHELL-1 and TRANS-M tasks, respectively. Moreover, the grammar-based mean opinion scores of our method are about 4.29 and 3.20, significantly higher than the baseline scores of 4.11 and 3.02.

Funder

National Key R&D Program of China

Publisher

MDPI AG

Subject

Fluid Flow and Transfer Processes, Computer Science Applications, Process Chemistry and Technology, General Engineering, Instrumentation, General Materials Science

Link

https://www.mdpi.com/2076-3417/13/7/4243/pdf

Reference: 36 articles.

1. Dong, P., Wang, S., Niu, W., Zhang, C., Lin, S., Li, Z., Gong, Y., Ren, B., Lin, X., and Tao, D. (2020, January 20–24). RTMobile: Beyond Real-Time Mobile Acceleration of RNNs for Speech Recognition. Proceedings of the 2020 57th ACM/IEEE Design Automation Conference (DAC), San Francisco, CA, USA.

2. Chenxuan, H. (2021, January 19–21). Research on Speech Recognition Technology for Smart Home. Proceedings of the 2021 IEEE 4th International Conference on Automation, Electronics and Electrical Engineering (AUTEEE), Shenyang, China.

3. Sathyendra, K.M., Muniyappa, T., Chang, F.-J., Liu, J., Su, J., Strimel, G.P., Mouchtaris, A., and Kunzmann, S. (2022, January 23–27). Contextual Adapters for Personalized Speech Recognition in Neural Transducers. Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Singapore.

4. Baevski, A., Zhou, Y., Mohamed, A., and Auli, M. (2020, January 6–12). wav2vec 2.0: A framework for self-supervised learning of speech representations. Proceedings of the Advances in Neural Information Processing Systems 33, Virtual.

5. Li, B., Chang, S.-Y., Sainath, T.N., Pang, R., He, Y., Strohman, T., and Wu, T. (2020, January 4–8). Towards Fast and Accurate Streaming End-To-End ASR. Proceedings of the ICASSP 2020—2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Barcelona, Spain.

Cited by 2 articles.

1. A neural machine translation method based on split graph convolutional self-attention encoding;PeerJ Computer Science;2024-02-28

2. Sentiment Analysis and Topic Modeling of E-Grocery Application Reviews Using Naive Bayes and Support Vector Machine: A Case Study of Segari Data Review on the Google Play Store;2023 3rd International Conference on Electronic and Electrical Engineering and Intelligent System (ICE3IS);2023-08-09