Visualizing Linguistic Complexity and Proficiency in Learner English Writings

Published:2023-05-25 Issue:2 Volume:40 Page:178-197
ISSN:2056-9017
Container-title:CALICO Journal
Short-container-title:CALICO

Author:

Gaillat Thomas, Lafontaine Antoine, Knefati Anas

Abstract

In this article, we focus on the design of a second language (L2) formative feedback system that provides linguistic complexity graph reports on the writings of English for special purposes students at the university level. The system is evaluated in light of formative instruction features pointed out in the literature. The significance of complexity metrics is also evaluated. A learner corpus of English classified according to the Common European Framework of Reference for Languages (CEFR) was processed using a pipeline that computes 83 complexity metrics. By way of analysis of variance (ANOVA) testing, multinomial logistic regression, and clustering methods, we identified and validated a set of nine significant metrics in terms of proficiency levels. Validation with classification gave 67.51% (A level), 60.16% (B level), and 60.47% (C level) balanced accuracy. Clustering showed between 53.10% and 67.37% homogeneity, depending on the level. As a result, these metrics were used to create graphical reports about the linguistic complexity of learner writing. These reports are designed to help language teachers diagnose their students’ writings in comparison with prerecorded cohorts of different proficiencies.

Publisher

Equinox Publishing

Subject

Computer Science Applications,Linguistics and Language,Language and Linguistics,Education

References: 43 articles.

1. Attali, Y., & Burstein, J. (2006). Automated essay scoring with e-rater® V.2. Journal of Technology, Learning and Assessment, 4(3), 3–29. https://ejournals.bc.edu/index.php/jtla/article/view/1650

2. Baayen, R. H. (2008). Analyzing linguistic data: A practical introduction to statistics using R. Cambridge: Cambridge University Press. https://doi.org/10.1017/CBO9780511801686

3. Ballier, N., Canu, S., Petitjean, C., Gasso, G., Balhana, C., …, & Gaillat, T. (2020). Machine learning for learner English. International Journal of Learner Corpus Research, 6(1), 72–103. https://doi.org/10.1075/ijlcr.18012.bal

4. Benoit, K., Watanabe, K., Wang, H., Nulty, P., Obeng, A., …, & Matsuo, A. (2018). quanteda: An R package for the quantitative analysis of textual data. Journal of Open Source Software, 3(30), 774. https://doi.org/10.21105/joss.00774

5. Biber, D., Gray, B., Staples, S., & Egbert, J. (2020). Investigating grammatical complexity in L2 English writing research: Linguistic description versus predictive measurement. Journal of English for Academic Purposes, 46, 100869. https://doi.org/10.1016/j.jeap.2020.100869