Exploring the Performance of Tagging for the Classical and the Modern Standard Arabic

Author:

AbuZeina Dia (1), Abdalbaset Taqieddin Mostafa (2)

Affiliation:

1. College of Information Technology and Computer Engineering, Palestine Polytechnic University, Hebron, State of Palestine

2. Palestine Technical University–Kadoorie, AL-Aroub Branch, Hebron, State of Palestine

Abstract

Part-of-speech (PoS) tagging is a core component of many natural language processing (NLP) applications. PoS taggers serve as a preprocessing step in various NLP tasks, such as syntactic parsing, information extraction, machine translation, and speech synthesis. In this paper, we examine how a tagger based on Modern Standard Arabic (MSA) performs on classical (i.e., traditional or historical) Arabic. Specifically, we use the Arabic model of the Stanford tagger to tag the imperative verbs in the Holy Quran. The Stanford tagger contains 29 tags; this work experimentally evaluates only one of them, VB, which denotes the imperative verb. The test set contains 741 imperative verbs, which appear in 1,848 positions in the Holy Quran. Although the previously reported accuracy of the Arabic model of the Stanford tagger is 96.26% for all tags and 80.14% for unknown words, the experimental results show an accuracy of only 7.28% for the imperative verbs. This result calls for further research into why the tagging is so inaccurate for classical Arabic. The decline in performance may indicate that NLP tasks need to distinguish between training data for classical Arabic and training data for MSA.
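The single-tag evaluation described in the abstract can be pictured with a minimal sketch. It assumes the tagger's output has already been collected as one predicted tag per position where the gold label is an imperative verb. The function names and the toy tag list are hypothetical and are not taken from the paper; only the VB tag and its meaning come from the abstract.

```python
from collections import Counter


def target_tag_accuracy(predicted_tags, target_tag="VB"):
    """Share of gold-imperative positions that the tagger labeled target_tag.

    predicted_tags holds one predicted tag per position whose gold label is
    an imperative verb (an assumption of this sketch, not the paper's code).
    """
    if not predicted_tags:
        raise ValueError("no positions to evaluate")
    hits = sum(1 for tag in predicted_tags if tag == target_tag)
    return hits / len(predicted_tags)


def tag_distribution(predicted_tags):
    """Count how often each tag was predicted, to see where the errors go."""
    return Counter(predicted_tags)


# Toy, made-up predictions for eight gold-imperative positions.
# These are NOT results from the paper; the non-VB tags are illustrative.
toy_predictions = ["VB", "NN", "VBP", "NN", "VB", "VBD", "NN", "NN"]

accuracy = target_tag_accuracy(toy_predictions)
print(f"Accuracy on imperative positions: {accuracy:.2%}")  # 2/8 = 25.00%
print(tag_distribution(toy_predictions).most_common())
```

Counting the predicted tags alongside the accuracy shows which tags absorb the misclassified imperatives, which is one way to start investigating why the accuracy is low.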

Funder

Palestine Polytechnic University

Publisher

Hindawi Limited

Subject

Computational Mathematics, Control and Optimization, Control and Systems Engineering

Cited by 1 article.

1. A BERT Based Approach for Arabic POS Tagging; Advances in Computational Intelligence; 2021
