An NLP-Based Exploration of Variance in Student Writing and Syntax: Implications for Automated Writing Evaluation

Published:2024-06-25 Issue:7 Volume:13 Page:160
ISSN:2073-431X
Container-title:Computers
language:en
Short-container-title:Computers

Author:

Goldshtein Maria¹, Alhashim Amin G.², Roscoe Rod D.¹

Affiliation:

1. Human Systems Engineering, Arizona State University, Mesa, AZ 85212, USA

2. Mathematics, Statistics, and Computer Science, Macalester College, Saint Paul, MN 55105, USA

Abstract

In writing assessment, expert human evaluators ideally judge individual essays with attention to variance among writers’ syntactic patterns. There are many ways to compose text successfully or less successfully. For automated writing evaluation (AWE) systems to provide accurate assessment and relevant feedback, they must be able to consider similar kinds of variance. The current study employed natural language processing (NLP) to explore variance in syntactic complexity and sophistication across clusters characterized in a large corpus (n = 36,207) of middle school and high school argumentative essays. Using NLP tools, k-means clustering, and discriminant function analysis (DFA), we observed that student writers employed four distinct syntactic patterns: (1) familiar and descriptive language, (2) consistently simple noun phrases, (3) variably complex noun phrases, and (4) moderate complexity with less familiar language. Importantly, each pattern spanned the full range of writing quality; there were no syntactic patterns consistently evaluated as “good” or “bad”. These findings support the need for nuanced approaches in automated writing assessment while informing ways that AWE can participate in that process. Future AWE research can and should explore similar variability across other detectable elements of writing (e.g., vocabulary, cohesion, discursive cues, and sentiment) via diverse modeling methods.
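As a rough illustration of the clustering step named in the abstract, the sketch below runs a plain k-means (Lloyd's algorithm) over synthetic two-feature "essays" using only the Python standard library. It is not the authors' pipeline: the two features (imagined here as a noun-phrase complexity index and a word-familiarity index), the group centers, and the helper names `kmeans` and `make_synthetic_essays` are all assumptions made for the example. The discriminant function analysis step is not shown.

```python
import random
from math import dist


def kmeans(points, k, iterations=100, seed=0):
    """Lloyd's algorithm on a list of equal-length numeric tuples.

    Returns (centroids, labels); labels[i] is the cluster index of points[i].
    """
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialize from k sampled data points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: dist(p, centroids[c])) for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:  # keep the old centroid if a cluster empties out
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
    return centroids, labels


def make_synthetic_essays(per_group=50, seed=1):
    """Hypothetical data: four loose groups of (complexity, familiarity) pairs."""
    rng = random.Random(seed)
    centers = [(2.0, 8.0), (2.5, 3.0), (6.0, 3.5), (5.5, 7.5)]  # made-up values
    return [
        (rng.gauss(cx, 0.4), rng.gauss(cy, 0.4))
        for cx, cy in centers
        for _ in range(per_group)
    ]


essays = make_synthetic_essays()
centroids, labels = kmeans(essays, k=4)
for c in range(4):
    print(f"cluster {c}: {labels.count(c)} essays")
```

With real data, each essay would be represented by many NLP-derived syntactic indices rather than two, and features would usually be standardized before clustering so that no single index dominates the distance. Holistic quality scores could then be compared within each cluster to check whether any cluster is confined to a narrow score range.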

Funder

Gates Foundation

Publisher

MDPI AG

Link

https://www.mdpi.com/2073-431X/13/7/160/pdf