Author:
Gamermann Ronaldo Wajnberg, Maneguello Fabio Junior
Abstract
Aviation safety reports are essential sources for identifying and analyzing risks in civil aviation. These reports are written in plain language, so processing them automatically requires Natural Language Processing techniques. In Brazil, the vast majority of reports are written in Portuguese. Comparing them with international databases of reports written in English therefore requires, as a first step, translating the Brazilian reports. This work proposes a machine translation model based on fine-tuning pre-trained models. To this end, an aviation-specific language corpus is developed to generate example data for model training. A pre-trained model is then fine-tuned on this corpus, with the aim of obtaining an automatic translation model that performs well given the specifics of aviation language. The result is a first model that produces coherent Portuguese/English translations in the aviation domain.
Publisher
South Florida Publishing LLC