Abstract
Question generation (QG) from a given context paragraph is a demanding task in natural language processing, with practical applications and prospects in various fields. Several studies have been conducted on QG in high-resource languages like English; however, very few have been done on resource-poor languages like Arabic and Bangla. In this work, we propose a fine-tuning method for QG that uses pre-trained transformer-based language models to generate questions from a given context paragraph in Bangla. Our approach is based on the idea that a transformer-based language model can learn the relationships between words and phrases in a context paragraph, which allows it to generate questions that are both relevant and grammatically correct. We fine-tuned three different transformer models: (1) BanglaT5, (2) mT5-base, and (3) BanglaGPT2, and demonstrated their capabilities using two different data formatting techniques: (1) AQL (All Questions Per Line) and (2) OQL (One Question Per Line), for a total of six different variations of QG models. For each of these variants, six different decoding algorithms were used to generate questions from the test dataset: (1) greedy search, (2) beam search, (3) random sampling, (4) top-k sampling, (5) top-p sampling, and (6) a combination of top-k and top-p sampling. To evaluate the quality of the questions generated by the different models and decoding techniques, we also fine-tuned another transformer model, BanglaBert, on two custom datasets of our own, creating two question classifier (QC) models that check the relevancy and grammatical correctness of the questions generated by our QG models. The QC models showed test accuracies of 88.54% and 95.76% for the correctness and relevancy checks, respectively.
Our results show that among all the QG variants, the mT5 OQL approach with beam search decoding outperformed all the others in terms of relevancy (77%) and correctness (96%), with BLEU-4, METEOR, and ROUGE-L scores of 36.60, 48.98, and 63.38, respectively.
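The decoding algorithms compared in the abstract differ only in how the next token is chosen from the model's output distribution. As an illustration (not code from the paper), the core of greedy selection, top-k filtering, and top-p (nucleus) filtering can be sketched over a plain probability vector; the function names and the toy distribution here are our own:

```python
import math

def softmax(logits):
    """Convert raw logits into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def greedy(probs):
    """Greedy search step: pick the single most probable token."""
    return max(range(len(probs)), key=lambda i: probs[i])

def top_k_filter(probs, k):
    """Keep only the k most probable tokens, then renormalize;
    sampling from the result is top-k sampling."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep = set(order[:k])
    masked = [p if i in keep else 0.0 for i, p in enumerate(probs)]
    z = sum(masked)
    return [p / z for p in masked]

def top_p_filter(probs, p):
    """Keep the smallest set of top tokens whose cumulative mass
    reaches p, then renormalize (nucleus sampling)."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep, mass = set(), 0.0
    for i in order:
        keep.add(i)
        mass += probs[i]
        if mass >= p:
            break
    masked = [q if i in keep else 0.0 for i, q in enumerate(probs)]
    z = sum(masked)
    return [q / z for q in masked]
```

Combining the two filters (apply top-k, then top-p, then sample), as in variant (6), simply chains these functions before the random draw.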
Publisher
Springer Science and Business Media LLC