Affiliation:
1. SRM Institute of Science and Technology
Abstract
Question Answering is a core task in Natural Language Processing that has attracted considerable attention. As language models have matured, question answering systems have been built and deployed to improve information retrieval. These systems, however, have largely been implemented only for English. Our objective was to create such a question answering system for the Tamil language. We chose XLM-RoBERTa, which has been trained on a variety of datasets, as our language model, and we used a hand-annotated dataset for validation. We trained the model on two types of datasets: the first contained only Tamil, whereas the second was a mixture of Tamil and other Indian languages. The results were satisfactory in both cases. Because training the model required a large amount of computational power, we used Google Colab Pro+ cloud GPUs. We will also publish our dataset on Hugging Face so that fellow researchers can use it for further analysis.
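As an illustration only, the sketch below shows the answer-selection step that models such as XLM-RoBERTa typically use when applied to extractive question answering: the model assigns each context token a start score and an end score, and the answer is the span that maximizes their sum. The abstract does not state that the authors used an extractive setup, so that is an assumption here; the function name `best_span`, the `max_answer_len` parameter, and the toy scores are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple


def best_span(
    start_scores: List[float],
    end_scores: List[float],
    max_answer_len: int = 30,
) -> Tuple[int, int, float]:
    """Return (start, end, score) for the highest-scoring span with start <= end."""
    if len(start_scores) != len(end_scores) or not start_scores:
        raise ValueError("score lists must be non-empty and of equal length")

    best = (0, 0, float("-inf"))
    for i, s in enumerate(start_scores):
        # Only consider spans of at most max_answer_len tokens.
        last = min(i + max_answer_len, len(end_scores))
        for j in range(i, last):
            score = s + end_scores[j]
            if score > best[2]:
                best = (i, j, score)
    return best


# Toy context ("Chennai is the capital of Tamil Nadu."), pre-split into tokens.
tokens = ["சென்னை", "தமிழ்நாட்டின்", "தலைநகரம்", "ஆகும்", "."]

# Hypothetical per-token scores; a real model would produce these as logits.
start_scores = [4.0, 0.5, 0.1, 0.0, -1.0]
end_scores = [3.5, 0.2, 1.0, 0.1, -1.0]

start, end, score = best_span(start_scores, end_scores)
answer = " ".join(tokens[start : end + 1])
print(answer, score)
```

In a real system the scores would come from a fine-tuned model's output over subword tokens, and the selected token indices would be mapped back to character offsets in the original Tamil passage.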
Publisher
Trans Tech Publications Ltd
Cited by
1 article.
1. Context-Aware Auto-Encoded Graph Neural Model for Dynamic Question Generation using NLP;ACM Transactions on Asian and Low-Resource Language Information Processing;2023-10-05