Affiliation:
1. The British University in Dubai, Dubai International Academic City, UAE
Abstract
In this paper, we propose a hybrid named entity recognition (NER) approach that combines the advantages of rule-based and machine learning-based (ML-based) approaches in order to improve overall system performance and to overcome the knowledge elicitation bottleneck and the lack of resources for underdeveloped languages that require deep language processing, such as Arabic. The complexity of Arabic poses special challenges to researchers of Arabic NER, which is essential for both monolingual and multilingual applications. We used the hybrid approach to develop an Arabic NER system that is capable of recognizing 11 types of Arabic named entities: Person, Location, Organization, Date, Time, Price, Measurement, Percent, Phone Number, ISBN and File Name. Extensive experiments were conducted using decision tree, Support Vector Machine and logistic regression classifiers to evaluate the system performance. The empirical results indicate that the hybrid approach outperforms both the rule-based and the ML-based approaches when each is used independently. More importantly, our system outperforms the state of the art in Arabic NER in terms of accuracy when applied to the standard ANERcorp dataset, with F-measures of 0.94 for Person, 0.90 for Location and 0.88 for Organization.
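For readers unfamiliar with the F-measure reported above, the following is a minimal sketch of how precision, recall and their harmonic mean can be computed for a set of predicted entities. It assumes exact-match scoring of (start, end, type) spans, which the abstract does not specify; the function name `precision_recall_f1` and the toy data are hypothetical and are not taken from the paper.

```python
def precision_recall_f1(gold, predicted):
    """Score predicted entities against gold entities by exact match.

    Each entity is a (start_token, end_token, entity_type) tuple.
    Exact-match scoring is an assumption made for this sketch.
    """
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F-measure (F1): harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical toy annotations, not data from the paper.
gold = [(0, 1, "Person"), (4, 4, "Location"), (7, 9, "Organization")]
predicted = [
    (0, 1, "Person"),
    (4, 4, "Location"),
    (7, 8, "Organization"),  # wrong span boundary, counts as an error
    (11, 11, "Person"),      # spurious entity
]

p, r, f = precision_recall_f1(gold, predicted)
print(f"precision={p:.2f} recall={r:.2f} F-measure={f:.2f}")
```

To obtain per-type scores such as those quoted for Person, Location and Organization, the same function could be applied to the gold and predicted entities filtered by type.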
Subject
Library and Information Sciences, Information Systems
Cited by
45 articles.
1. Building the ArabNER Corpus for Arabic Named Entity Recognition Using ChatGPT and Bard;Lecture Notes in Computer Science;2024
2. BERT for Arabic NLP Applications: Pretraining and Finetuning MSA and Arabic Dialects;Communications in Computer and Information Science;2023-11-21
3. Optimizing Arabic Named Entity Recognition through Active Learning and AraBERT;2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA);2023-09-20
4. Comparing Open Arabic Named Entity Recognition Tools;2023 IEEE 24th International Conference on Information Reuse and Integration for Data Science (IRI);2023-08
5. A Novel Named Entity Recognition Method for Bus Route Identification in Social Media;2023 2nd International Conference on Machine Learning, Cloud Computing and Intelligent Mining (MLCCIM);2023-07-25