Affiliation:
1. College of Air Traffic Management, Civil Aviation Flight University of China, Guanghan 618307, China
Abstract
In recent years, automatic speech recognition (ASR) technology has improved significantly. However, training an ASR model is complex, requiring large amounts of data and many algorithms. Training a new model for air traffic control (ATC) is therefore a considerable task that may require many researchers for maintenance and upgrading. In this paper, we develop an improved fusion method that adapts the language model (LM) in ASR to the ATC domain. Instead of the vocabulary used in traditional fusion, this method uses ATC instructions to improve the LM. With vocabulary fusion, the character error rate (CER) on the ATC corpus decreases from 0.3493 to 0.2876; with the improved fusion, it decreases from 0.3493 to 0.2761. Although the two fusion methods differ in CER by less than 2 percentage points, the perplexity shows that the LM obtained with the improved fusion is much better than the one obtained with vocabulary fusion.
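To make the reported figures concrete, the following is a minimal sketch of how a CER is commonly computed and how the quoted reductions compare. The abstract does not state the exact scoring procedure, so the character-level Levenshtein definition used here is an assumption, and the function names `edit_distance`, `cer`, and `relative_reduction` are hypothetical helpers, not the authors' code.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Character-level Levenshtein distance (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Assumed definition: edit distance divided by the reference length."""
    if not ref:
        raise ValueError("reference transcript must not be empty")
    return edit_distance(ref, hyp) / len(ref)


def relative_reduction(baseline: float, improved: float) -> float:
    """Fraction of the baseline error removed by the improved system."""
    return (baseline - improved) / baseline


# CER values quoted in the abstract.
baseline, vocab_fusion, improved_fusion = 0.3493, 0.2876, 0.2761

print(f"vocabulary fusion: {relative_reduction(baseline, vocab_fusion):.1%} relative reduction")
print(f"improved fusion:   {relative_reduction(baseline, improved_fusion):.1%} relative reduction")
print(f"absolute gap between fusions: {vocab_fusion - improved_fusion:.4f}")
```

Under this reading, the gap between the two fusion methods is an absolute CER difference of 0.0115, which is consistent with the "less than 2" figure in the abstract.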
Funder
National Key R&D Program of China