Sentence splitting in Arabic to Spanish translation

Published:2023-07-04 Issue:2 Volume:36 Page:585-614
ISSN:0213-2028
Container-title:Revista Española de Lingüística Aplicada/Spanish Journal of Applied Linguistics
language:en
Short-container-title:RESLA

Author:

Roldán Juan¹, Feria García Manuel¹

Affiliation:

1. University of Granada

Abstract

Modern Standard Arabic makes extensive use of coordination particles, whereas punctuation marks are scarce and erratic, leading to long clauses. This is generally assumed to hinder Sentence Boundary Detection and to promote sentence splitting when translating from Arabic into English. Previous literature on translation from Arabic to Spanish is practically nonexistent. We have tested this hypothesis regarding translation from Arabic to Spanish on a sample of 282,714 graphic words extracted from a bilingual corpus of 8,681,110 graphic words and found that each Arabic sentence yielded an average of 1.5 Spanish sentences. Furthermore, our data show the potential impact of directionality, in that sentence splitting when translating from Arabic into Spanish is 50% more frequent than from English into Arabic. We also determined statistically that five elements (wa [و], ḥaythu [حيث], kamā [كما], wa-qad [وقد], and wa-dhalika [وذلك]) are the most salient potential markers for sentence splitting in the resulting Spanish translations. Our findings should be of particular interest for Computational Linguistics and translator training.
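The two headline figures in the abstract (the sample's share of the corpus and the average number of Spanish sentences per Arabic sentence) are simple ratios. The following is a minimal Python sketch of how such figures could be computed from sentence-aligned data. It is not the authors' procedure: the function names and the toy alignment counts are assumptions made for illustration only. The two word counts are the only values taken from the abstract.

```python
def split_ratio(target_counts):
    """Mean number of target sentences produced per source sentence.

    target_counts: one integer per source sentence, giving how many
    target-language sentences it was rendered as (hypothetical input format).
    """
    if not target_counts:
        raise ValueError("no aligned sentences supplied")
    return sum(target_counts) / len(target_counts)


def sample_share(sample_words, corpus_words):
    """Fraction of the corpus covered by the sample, by word count."""
    if corpus_words <= 0:
        raise ValueError("corpus size must be positive")
    return sample_words / corpus_words


if __name__ == "__main__":
    # Word counts reported in the abstract.
    share = sample_share(282_714, 8_681_110)
    print(f"sample share of corpus: {share:.2%}")

    # Toy, made-up alignment: four Arabic sentences rendered as
    # 1, 2, 1 and 2 Spanish sentences respectively.
    toy_counts = [1, 2, 1, 2]
    print(f"average target sentences per source sentence: {split_ratio(toy_counts)}")
```

With the figures from the abstract, the sample amounts to roughly 3.3% of the corpus. A ratio of 1.5 means that, on average, every two Arabic sentences correspond to three Spanish ones.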

Publisher

John Benjamins Publishing Company

Subject

Linguistics and Language,Language and Linguistics

Link

http://www.jbe-platform.com/deliver/fulltext/resla.21008.rol.pdf
