A Machine Learning-Based Readability Model for Gujarati Texts

Published: 2023-12-21
ISSN: 2375-4699
Container title: ACM Transactions on Asian and Low-Resource Language Information Processing
Language: en
Short container title: ACM Trans. Asian Low-Resour. Lang. Inf. Process.

Author:

Bhogayata Chandrakant K.¹

Affiliation:

1. Maharaja Krishnakumarsinhji Bhavnagar University, Bhavnagar, Gujarat state, India

Abstract

This study develops a machine learning-based model to predict the readability of Gujarati texts. The dataset comprised fifty prose passages from Gujarati literature. Fourteen lexical and syntactic text features were extracted from the dataset using a unigram POS tagger (a machine learning algorithm) and three Python scripts. Two samples of native Gujarati-speaking students, from secondary and higher education, rated the readability of the texts on a 10-point scale from 'easy' to 'difficult', with interrater agreement. After dimensionality reduction, seven text features served as the independent variables and the mean readability rating as the dependent variable for training the readability model. Because the students' level of education and gender were related to their readability ratings, four readability models (for school students, university students, male students, and female students) were trained with backward stepwise multiple linear regression, a supervised machine learning algorithm. The trained models are comparable across the rater groups. The best model is the one trained on university students' readability ratings. The model is cross-validated; it explains 91% of the variance in readability ratings at training and 88% at cross-validation, and its effect size is large and its statistical power high.
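The modelling step named in the abstract (backward stepwise multiple linear regression followed by cross-validation) can be sketched roughly as follows. This is a minimal illustration and not the author's code. The abstract does not state the removal criterion, so adjusted R² is assumed here. The data are synthetic stand-ins for fifty passages with seven features, and every function name is hypothetical.

```python
import random

def solve(a, b):
    """Solve the linear system a*x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit_ols(rows, y, cols):
    """Least-squares coefficients (intercept first) using only the feature columns in cols."""
    X = [[1.0] + [row[c] for c in cols] for row in rows]
    k = len(cols) + 1
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * t for x, t in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)

def predict(beta, rows, cols):
    return [beta[0] + sum(b * row[c] for b, c in zip(beta[1:], cols)) for row in rows]

def r_squared(y, y_hat):
    mean = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    ss_tot = sum((a - mean) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot

def adjusted_r2(rows, y, cols):
    n, p = len(y), len(cols)
    r2 = r_squared(y, predict(fit_ols(rows, y, cols), rows, cols))
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)

def backward_stepwise(rows, y):
    """Start from all features; repeatedly drop the one whose removal raises adjusted R² most.

    Assumption: adjusted R² is the removal criterion (the abstract does not specify one).
    """
    cols = list(range(len(rows[0])))
    best = adjusted_r2(rows, y, cols)
    while len(cols) > 1:
        score, drop = max(
            (adjusted_r2(rows, y, [c for c in cols if c != d]), d) for d in cols
        )
        if score <= best:
            break
        best = score
        cols.remove(drop)
    return cols

def cross_validated_r2(rows, y, cols, folds=5):
    """Pooled out-of-fold R² from k-fold cross-validation.

    Note: cols is chosen on the full data beforehand, so this estimate is mildly optimistic.
    """
    y_hat = [0.0] * len(y)
    for f in range(folds):
        train = [i for i in range(len(y)) if i % folds != f]
        beta = fit_ols([rows[i] for i in train], [y[i] for i in train], cols)
        for i in range(len(y)):
            if i % folds == f:
                y_hat[i] = predict(beta, [rows[i]], cols)[0]
    return r_squared(y, y_hat)

# Synthetic data (not from the paper): 50 "passages", 7 "features", and a
# "mean rating" that depends only on features 0 and 3 plus noise.
random.seed(0)
rows = [[random.gauss(0, 1) for _ in range(7)] for _ in range(50)]
y = [5.0 + 1.5 * r[0] - 1.0 * r[3] + random.gauss(0, 0.3) for r in rows]

selected = backward_stepwise(rows, y)
train_r2 = r_squared(y, predict(fit_ols(rows, y, selected), rows, selected))
cv_r2 = cross_validated_r2(rows, y, selected)
print(selected, round(train_r2, 2), round(cv_r2, 2))
```

Comparing `train_r2` with `cv_r2` mirrors the abstract's comparison of variance explained at training and at cross-validation. A small gap between the two suggests the selected features generalize beyond the training passages.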

Publisher

Association for Computing Machinery (ACM)

Subject

General Computer Science

Link

https://dl.acm.org/doi/pdf/10.1145/3637826

Cited by 1 article.

1. Self-attention based optimized deep convolutional robust character and numeral recognition from Gujarati language; Multimedia Tools and Applications; 2024-07-30