Affiliation:
1. Department of Computer Sciences, Yarmouk University, Jordan
2. Department of Management Information System, Yarmouk University, Jordan
3. Department of Computer Information System, Yarmouk University, Jordan
Abstract
Maintaining the topical coherence while writing a discourse is a major challenge confronting novice and non-novice writers alike. This challenge is even more intense with Arabic discourse because of the complex morphology and the widespread of synonyms in Arabic language. In this research, we present a direct classification of Arabic discourse document while writing. This prescriptive proposed framework consists of the following stages: data collection, pre-processing, construction of Language Model (LM), topics identification, topics classification, and topic notification. To prove and demonstrate our proposed framework, we designed a system and applied it on a corpus of 2800 Arabic discourse documents synthesized into four predefined topics related to: Culture, Economy, Sport, and Religion. System performance was analysed, in terms of accuracy, recall, precision, and F-measure. The results demonstrated that the proposed topic modeling-based decision framework is able to classify topics while writing a discourse with accuracy of 91.0%.
Cited by
7 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献