Nonuniform language in technical writing: Detection and correction

Published: 2020-03-06
Volume: 27
Issue: 3
Pages: 293-314
ISSN: 1351-3249
Container-title: Natural Language Engineering
Short-container-title: Nat. Lang. Eng.
Language: en

Author:

Wang Weibo, Islam Aminul, Moh’d Abidalrahman, Soto Axel J., Milios Evangelos E.

Abstract

Technical writing in professional environments, such as user manual authoring, requires the use of uniform language. Nonuniform language refers to sentences in a technical document that are intended to have the same meaning within a similar context, but use different words or writing style. Addressing this nonuniformity problem requires the performance of two tasks. The first task, which we named nonuniform language detection (NLD), is detecting such sentences. We propose an NLD method that utilizes different similarity algorithms at lexical, syntactic, semantic and pragmatic levels. Different features are extracted and integrated by applying a machine learning classification method. The second task, which we named nonuniform language correction (NLC), is deciding which sentence among the detected ones is more appropriate for that context. To address this problem, we propose an NLC method that combines contraction removal, near-synonym choice, and text readability comparison. We tested our methods using smartphone user manuals. We finally compared our methods against state-of-the-art methods in paraphrase detection (for NLD) and against expert annotators (for both NLD and NLC). The experiments demonstrate that the proposed methods achieve performance that matches expert annotators.
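
As a rough illustration of two ingredients named in the abstract (contraction removal and a lexical-level similarity feature), the following Python sketch may help. It is not the authors' implementation: the contraction table, the function names, and the choice of Jaccard word overlap as the lexical feature are all assumptions made here for illustration, and the abstract does not specify which similarity algorithms the paper uses.

```python
import re

# Hypothetical, deliberately small contraction table (an assumption for this
# sketch); a practical system would need a much fuller list.
CONTRACTIONS = {
    "can't": "cannot",
    "don't": "do not",
    "doesn't": "does not",
    "won't": "will not",
    "it's": "it is",
    "you're": "you are",
}


def expand_contractions(sentence: str) -> str:
    """Lowercase the sentence and replace known contractions with full forms."""
    # Normalize curly apostrophes so they match the table keys.
    text = sentence.lower().replace("\u2019", "'")
    # Unknown contractions (or possessives) are left unchanged.
    return re.sub(
        r"[a-z]+'[a-z]+",
        lambda m: CONTRACTIONS.get(m.group(0), m.group(0)),
        text,
    )


def word_set(sentence: str) -> set:
    """Return the set of alphabetic tokens after contraction expansion."""
    return set(re.findall(r"[a-z]+", expand_contractions(sentence)))


def jaccard_similarity(a: str, b: str) -> float:
    """Word-overlap similarity in [0, 1]; one possible lexical-level feature."""
    words_a, words_b = word_set(a), word_set(b)
    if not words_a and not words_b:
        return 1.0
    return len(words_a & words_b) / len(words_a | words_b)


if __name__ == "__main__":
    s1 = "Tap the icon to take a photo."
    s2 = "Touch the icon to take a picture."
    print(expand_contractions("You can't undo this."))
    print(round(jaccard_similarity(s1, s2), 3))
```

In a pipeline of the kind the abstract describes, a score like this would be only one feature among several (syntactic, semantic, pragmatic) passed to a classifier; on its own it cannot tell that "photo" and "picture" are near-synonyms.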

Publisher

Cambridge University Press (CUP)

Subject

Artificial Intelligence, Linguistics and Language, Language and Linguistics, Software

Cited by 1 article.

1. Urdu paraphrase detection: A novel DNN-based implementation using a semi-automatically generated corpus. Natural Language Engineering, 2023-05-29.