A survey of automatic Arabic diacritization techniques

Published:2013-10-10 Issue:3 Volume:21 Page:477-495
ISSN:1351-3249
Container-title:Natural Language Engineering
language:en
Short-container-title:Nat. Lang. Eng.

Author:

Aqil M. Azmi, Reham S. Almajed

Abstract

In Modern Standard Arabic, texts are typically written without diacritical markings. The diacritics are important to clarify the sense and meaning of words. Lack of these markings may lead to ambiguity even for native speakers. Native speakers often disambiguate the meaning successfully through context; however, many Arabic applications, such as machine translation, text-to-speech, and information retrieval, are vulnerable to the lack of diacritics. The process of automatically restoring diacritical marks is called diacritization or diacritic restoration. In this paper we discuss the properties of the Arabic language and the issues related to the lack of diacritical marking. This is followed by a survey of recent algorithms developed to solve the diacritization problem. We also look into future trends for researchers working in this area.

Publisher

Cambridge University Press (CUP)

Subject

Artificial Intelligence,Linguistics and Language,Language and Linguistics,Software

Reference: 47 articles.

1. Zerrouki T. 2011. Tashkeela: Arabic vocalized text corpus. Retrieved June 9, 2013, from http://aracorpus.e3rab.com/.

2. Wikipedia. n.d. Danish and Norwegian alphabet. Retrieved March 17, 2013, from http://en.wikipedia.org/wiki/Danish_and_Norwegian_alphabet.

3. A hybrid approach for building Arabic diacritizer

4. Cross-dialectal data sharing for acoustic modeling in Arabic speech recognition

5. A Stochastic Arabic Diacritizer Based on a Hybrid of Factorized and Unfactorized Textual Features

Cited by 51 articles.

1. BERT-Based Arabic Diacritization: A state-of-the-art approach for improving text accuracy and pronunciation;Expert Systems with Applications;2024-08

2. Unlocking the language barrier: A Journey through Arabic machine translation;Multimedia Tools and Applications;2024-06-14

3. Unlocking the Power of Transfer Learning with Ad-Dabit-Al-Lughawi: A Token Classification Approach for Enhanced Arabic Text Diacritization;2024

4. A Hybrid Arabic text summarization Approach based on Seq-to-seq and Transformer;2023-03-15

5. Systematic Review of Automatic Arabic Text Summarization Techniques;Business Intelligence and Information Technology;2023