Abstract
The use of stylometry, authorship recognition through purely linguistic means, has contributed to literary, historical, and criminal investigation breakthroughs. Existing stylometry research assumes that authors have not attempted to disguise their linguistic writing style. We challenge this basic assumption of existing stylometry methodologies and present a new area of research: adversarial stylometry. Adversaries have a devastating effect on the robustness of existing classification methods. Our work presents a framework for creating adversarial passages, including obfuscation, where a subject attempts to hide her identity; imitation, where a subject attempts to frame another subject by imitating his writing style; and translation, where original passages are obfuscated with machine translation services. This research demonstrates that manual circumvention methods work very well, while automated translation methods are not effective. The obfuscation method reduces the techniques' effectiveness to the level of random guessing, and the imitation attempts succeed up to 67% of the time, depending on the stylometry technique used. These results are all the more significant given that the experimental subjects were unfamiliar with stylometry, were not professional writers, and spent little time on the attacks. This article also contributes to the field by using human subjects to empirically validate the claim of high accuracy for four current techniques (without adversaries). We have also compiled and released two corpora of adversarial stylometry texts, with a total of 57 unique authors, to promote research in this field. We argue that this field is important to a multidisciplinary approach to privacy, security, and anonymity.
Funder
Intel Corporation
Defense Advanced Research Projects Agency
Publisher
Association for Computing Machinery (ACM)
Subject
Safety, Risk, Reliability and Quality; General Computer Science
Cited by
110 articles.