Neural Machine Translation System for English to Indian Language Translation Using MTIL Parallel Corpus

Author:

Premjith B., Kumar M. Anand, Soman K.P.

Abstract

The introduction of deep neural networks to machine translation research has improved on conventional machine translation systems in multiple ways, specifically in terms of translation quality. The ability of deep neural networks to learn a sensible representation of words is one of the major reasons for this improvement. Although machine translation with deep neural architectures shows state-of-the-art results for European languages, these algorithms cannot be applied directly to Indian languages, mainly for two reasons: good corpora are unavailable, and Indian languages are morphologically rich. In this paper, we propose a neural machine translation (NMT) system for four language pairs: English–Malayalam, English–Hindi, English–Tamil, and English–Punjabi. We also collected sentences from different sources and cleaned them to build a parallel corpus for each of the four language pairs, and then used these corpora to model the translation system. The encoder network in the NMT architecture was designed with long short-term memory (LSTM) networks and bi-directional recurrent neural networks (Bi-RNN). The resulting models were evaluated both automatically and manually. For automatic evaluation, the bilingual evaluation understudy (BLEU) score was used; for manual evaluation, three metrics were used: adequacy, fluency, and overall ranking. Analysis of the results showed that the presence of lengthy sentences in the English–Malayalam and English–Hindi corpora affected the translation. An attention mechanism was employed to address the problem of translating lengthy sentences (sentences containing more than 50 words), and the system was able to perceive long-term contexts in the sentences.
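As an illustration of the automatic evaluation mentioned in the abstract, the following is a minimal sketch of a corpus-level BLEU score with a single reference per sentence, using clipped n-gram precision up to 4-grams and a brevity penalty. It is not the authors' evaluation script: the function names (`ngrams`, `clipped_counts`, `corpus_bleu`, `is_lengthy`), the whitespace-tokenized inputs, and the absence of smoothing are assumptions made for this example. The `is_lengthy` helper only mirrors the abstract's definition of a lengthy sentence (more than 50 words).

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def clipped_counts(candidate, reference, n):
    """Return (clipped matches, total candidate n-grams) for one sentence pair."""
    cand = Counter(ngrams(candidate, n))
    ref = Counter(ngrams(reference, n))
    # Each candidate n-gram is credited at most as often as it occurs in the reference.
    clipped = sum(min(count, ref[gram]) for gram, count in cand.items())
    return clipped, sum(cand.values())


def corpus_bleu(candidates, references, max_n=4):
    """Corpus-level BLEU, one reference per candidate, no smoothing (assumption)."""
    matches = [0] * max_n
    totals = [0] * max_n
    cand_len = 0
    ref_len = 0
    for cand, ref in zip(candidates, references):
        cand_len += len(cand)
        ref_len += len(ref)
        for n in range(1, max_n + 1):
            clipped, total = clipped_counts(cand, ref, n)
            matches[n - 1] += clipped
            totals[n - 1] += total
    # Without smoothing, any n-gram order with zero matches gives a score of 0.
    if cand_len == 0 or min(matches) == 0:
        return 0.0
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # The brevity penalty lowers the score of candidates shorter than the references.
    brevity = 1.0 if cand_len > ref_len else math.exp(1 - ref_len / cand_len)
    return brevity * math.exp(log_precision)


def is_lengthy(tokens, limit=50):
    """True if a sentence has more than `limit` words (50 in the abstract)."""
    return len(tokens) > limit


if __name__ == "__main__":
    hypothesis = "the cat sat on the mat".split()
    reference = "the cat sat on a mat".split()
    print(round(corpus_bleu([hypothesis], [reference]), 4))
```

Published BLEU figures are usually produced with an established toolkit and a fixed tokenization, so scores from a sketch like this are not directly comparable with reported results.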

Publisher

Walter de Gruyter GmbH

Subject

Artificial Intelligence, Information Systems, Software

References: 78 articles.

1. Moses: open source toolkit for statistical machine translation, 2007

2. Development of Malayalam text generator for translation from English, 2011

3. An interactive approach to development of English-Tamil machine translation system on the web, 2002

4. Rule based machine translation system for English to Malayalam language, 2011

5. Choosing the right evaluation for machine translation: an examination of annotator and automatic metric performance on human judgment tasks, 2010

Cited by 44 articles.

1. Computational Approach for Halegannada to Hosagannada Poem Translation; 2023 4th International Conference on Intelligent Technologies (CONIT); 2024-06-21

2. A Data Selection Approach for Enhancing Low Resource Machine Translation Using Cross Lingual Sentence Representations; 2024 IEEE 9th International Conference for Convergence in Technology (I2CT); 2024-04-05

3. Automated Multilingual Multimedia Dissemination Of Government Press Releases; 2024 5th International Conference on Innovative Trends in Information Technology (ICITIIT); 2024-03-15

4. Systematic Review of Morphological and Semantic Analysis in a Low Resource Language; Advances in Computational Intelligence and Robotics; 2024-02-27

5. Tulu Language Text Recognition and Translation; IEEE Access; 2024
