Abstract
Arabic text classification is the task of assigning Arabic text content to the appropriate category according to its context. In this paper, a novel deep learning system for Arabic text computer-aided recognition (ArCAR) is proposed to represent and recognize Arabic text at the character level. Each character of the input Arabic text is quantized as a 1D vector, and these vectors together form the 2D array that is fed to the ArCAR system. The ArCAR system is validated with 5-fold cross-validation for two applications: Arabic text document classification and Arabic sentiment analysis. For document classification, the ArCAR system achieves its best performance on the Alarabiya-balance dataset, with overall accuracy, recall, precision, and F1-score of 97.76%, 94.08%, 94.16%, and 94.09%, respectively. For Arabic sentiment analysis, the ArCAR system also performs well, achieving its best performance on the balance version of the hotel Arabic reviews dataset (HARD), with overall accuracy and F1-score of 93.58% and 93.23%, respectively. The proposed ArCAR system appears to provide a practical solution for accurate Arabic text representation, understanding, and classification.
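The character-level quantization described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the character set, the vector encoding, or the input length, so everything here is an assumption made for illustration: the alphabet (the 28 basic Arabic letters plus a space), the one-hot encoding, the zero-padding scheme, and the names `ALPHABET`, `CHAR_INDEX`, and `quantize` are all hypothetical and are not taken from the paper.

```python
# Hypothetical alphabet: 28 basic Arabic letters plus a space.
# The actual ArCAR character set is not given in the abstract.
ALPHABET = "ابتثجحخدذرزسشصضطظعغفقكلمنهوي "
CHAR_INDEX = {ch: i for i, ch in enumerate(ALPHABET)}


def quantize(text, max_len=16):
    """Encode text as a max_len x len(ALPHABET) matrix of 0/1 values.

    Each row is the 1D vector for one character (one-hot is an assumed
    encoding). Characters outside ALPHABET become all-zero rows, and
    short inputs are padded with all-zero rows up to max_len.
    """
    matrix = []
    for ch in text[:max_len]:          # truncate long inputs
        row = [0] * len(ALPHABET)
        idx = CHAR_INDEX.get(ch)
        if idx is not None:
            row[idx] = 1               # mark the position of this character
        matrix.append(row)
    while len(matrix) < max_len:       # pad short inputs
        matrix.append([0] * len(ALPHABET))
    return matrix


if __name__ == "__main__":
    encoded = quantize("كتاب", max_len=6)
    print(len(encoded), "rows x", len(encoded[0]), "columns")
```

A fixed `max_len` keeps every input the same shape, which is what a convolutional model operating on a 2D array would typically require; the value 16 is arbitrary and chosen only to keep the example small.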
Funder
Ministry of Science and ICT, South Korea
Korea Government
NRF
Subject
Computational Mathematics, Computational Theory and Mathematics, Numerical Analysis, Theoretical Computer Science
Cited by
20 articles.