Automatic Classification of Text Complexity

Published: 2020-10-18
Issue: 20
Volume: 10
Page: 7285
ISSN: 2076-3417
Container-title: Applied Sciences
Language: en
Short-container-title: Applied Sciences

Author:

Santucci Valentino, Santarelli Filippo, Forti Luciana, Spina Stefania

Abstract

This work introduces an automatic classification system that measures the complexity level of a given Italian text from a linguistic point of view. Measuring the complexity of a text is cast as a supervised classification problem, using a dataset of texts produced specifically by linguistic experts for second-language teaching and assessment. The widely adopted Common European Framework of Reference for Languages (CEFR) levels serve as the target classes, each text is processed into a large set of numeric linguistic features, and ten widely used machine learning models are compared experimentally. The results show that the proposed approach achieves good prediction accuracy, and a further analysis identifies the categories of features that influenced the predictions.

Publisher

MDPI AG

Subject

Fluid Flow and Transfer Processes, Computer Science Applications, Process Chemistry and Technology, General Engineering, Instrumentation, General Materials Science

Link

https://www.mdpi.com/2076-3417/10/20/7285/pdf

References: 80 articles.

1. Neural Network Methods for Natural Language Processing

2. Recent Trends in Deep Learning Based Natural Language Processing [Review Article]

3. Recent automatic text summarization techniques: a survey

Cited by 17 articles.

1. Bridging Linguistic Gaps: Developing a Greek Text Simplification Dataset; Information; 2024-08-20

2. Using GPT-3 as a Text Data Augmentator for a Complex Text Detector; 2023 IEEE 5th International Conference on BioInspired Processing (BIP); 2023-11-28

3. Topic Modeling for Text Structure Assessment: The case of Russian Academic Texts; Journal of Language and Education; 2023-09-30

4. Twenty Years of Machine-Learning-Based Text Classification: A Systematic Review; Algorithms; 2023-04-29

5. A Corpus-Based Word Classification Method for Detecting Difficulty Level of English Proficiency Tests; Applied Sciences; 2023-01-29