Abstract
Current student-centred, multilingual, active teaching methodologies require that teachers have continuous access to texts that are adequate in terms of topic and language competence. However, finding appropriate materials is arduous and time-consuming for teachers. Building on automatic readability assessment research that could assist teachers, we explore the performance of natural language processing approaches on educational science documents for secondary education. To date, readability assessment has mainly been explored for English. In this work, we extend our research to Basque and Spanish alongside English by compiling context-specific corpora and then testing the performance of feature-based machine learning and deep learning models. Our evaluation shows that our models do not generalize well, although deep learning models obtain better accuracy and F1 in all configurations. Further research in this area is still necessary to identify the characteristics of training corpora and the model parameters that reliably ensure generalizability.
Funder
UPV/EHU
Ministerio de Ciencia, Innovación y Universidades
Eusko Jaurlaritza
Publisher
Springer Science and Business Media LLC