Integrating LSA-based hierarchical conceptual space and machine learning methods for leveling the readability of domain-specific texts

Published:2019-04-05 Issue:3 Volume:25 Page:331-361
ISSN:1351-3249
Container-title:Natural Language Engineering
language:en
Short-container-title:Nat. Lang. Eng.

Author:

Tseng Hou-Chiang, Chen Berlin, Chang Tao-Hsing, Sung Yao-Ting

Abstract

Text readability assessment is a challenging interdisciplinary endeavor with rich practical implications. It has long drawn the attention of researchers internationally, and the readability models developed since have been widely applied in various fields. Previous readability models have relied only on linguistic features designed for general text analysis and have not been sufficiently accurate when used to gauge domain-specific texts. In view of this, this study proposes a hierarchical conceptual space constructed with latent semantic analysis (LSA) that can be used to train a readability model to accurately assess domain-specific texts. Compared with a baseline using a traditional model, the new model improves by 13.88% to reach 68.98% accuracy when leveling social science texts, and by 24.61% to reach 73.96% accuracy when leveling natural science texts. When the readability features developed for this study are combined with general linguistic features, the accuracy of leveling social science texts improves further, by 31.58% to reach 86.68%, and that of natural science texts by 26.56% to reach 75.91%. These results indicate that the readability features developed in this study can be used both on their own to train a readability model for leveling domain-specific texts and in combination with the more common linguistic features to enhance the model's efficacy. Future research can extend the generalizability of the model by applying the proposed method to texts from different fields and grade levels, thus broadening its practical applications.
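The gains and final accuracies quoted in the abstract can be cross-checked with simple arithmetic. The sketch below assumes that each reported gain is an absolute difference in percentage points, which the abstract does not state explicitly; the names `reported`, `implied_baseline`, and `baselines` are hypothetical and used only for illustration.

```python
# Figures quoted in the abstract: (reported gain, reported final accuracy), in percent.
reported = {
    ("social science", "LSA features"): (13.88, 68.98),
    ("natural science", "LSA features"): (24.61, 73.96),
    ("social science", "LSA + general linguistic features"): (31.58, 86.68),
    ("natural science", "LSA + general linguistic features"): (26.56, 75.91),
}


def implied_baseline(gain, final):
    # Assumption: the gain is an absolute difference in percentage points,
    # so the baseline accuracy is the final accuracy minus the gain.
    return round(final - gain, 2)


baselines = {key: implied_baseline(*values) for key, values in reported.items()}

for (domain, model), base in baselines.items():
    print(f"{domain:15s} | {model:35s} | implied baseline {base:.2f}%")
```

Under this reading, both social science figures imply the same baseline accuracy (55.10%) and both natural science figures imply the same baseline (49.35%), which is consistent with a single traditional baseline model per domain.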

Publisher

Cambridge University Press (CUP)

Subject

Artificial Intelligence, Linguistics and Language, Language and Linguistics, Software

References: 107 articles.

1. Domain-specific iterative readability computation

Cited by 18 articles.

1. Earnings management, key audit matters and audit report readability;Pacific Accounting Review;2024-09-09

2. Evaluation of the quality and readability of ChatGPT responses to frequently asked questions about myopia in traditional Chinese language;DIGITAL HEALTH;2024-01

3. Readability Grading Based on Multidimensional Linguistics Features for International Chinese Language Education;IEEE Access;2024

4. Computers’ Interpretations of Knowledge Representation Using Pre-Conceptual Schemas: An Approach Based on the BERT and Llama 2-Chat Models;Big Data and Cognitive Computing;2023-12-14

5. Text Complexity of Chinese Elementary School Textbooks: Analysis of Text Linguistic Features Using Machine Learning Algorithms;Scientific Studies of Reading;2023-08-14