On the (In)effectiveness of Mosaicing and Blurring as Tools for Document Redaction

Published:2016-07-14 Issue:4 Volume:2016 Page:403-417
ISSN:2299-0984
Container-title:Proceedings on Privacy Enhancing Technologies
language:en

Author:

Hill Steven¹, Zhou Zhimin, Saul Lawrence, Shacham Hovav

Affiliation:

1. UC San Diego

Abstract

In many online communities, it is the norm to redact names and other sensitive text from posted screenshots. Sometimes solid bars are used; sometimes a blur or other image transform is used. We consider the effectiveness of two popular image transforms, mosaicing (also known as pixelization) and blurring, for redaction of text. Our main finding is that we can use a simple but powerful class of statistical models, so-called hidden Markov models (HMMs), to recover both short and indefinitely long instances of redacted text. Our approach builds on the success of HMMs in automatic speech recognition, where they are used to recover sequences of phonemes from utterances of speech. Here we use HMMs in an analogous way to recover sequences of characters from images of redacted text. We evaluate an implementation of our system against multiple typefaces, font sizes, grid sizes, pixel offsets, and levels of noise. We also decode numerous real-world examples of redacted text. We conclude that mosaicing and blurring, despite their widespread usage, are not viable approaches for text redaction.
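
The idea in the abstract can be illustrated with a toy sketch; this is not the authors' implementation. It assumes three hypothetical 4x4 binary glyphs, a mosaic grid aligned exactly with glyph boundaries, Gaussian-style emission scores, and made-up transition probabilities (`GLYPHS`, `TRANS`, and every function name below are illustrative). Hidden states are characters, observations are the averaged mosaic cells, and Viterbi decoding returns the most likely character sequence.

```python
import math

# Hypothetical 4x4 binary glyphs (1 = ink); real typefaces are far richer.
GLYPHS = {
    "a": ["0110", "1001", "1111", "1001"],
    "b": ["1110", "1010", "1011", "1111"],
    "c": ["0111", "1000", "1000", "0111"],
}
# Hypothetical character-to-character transition probabilities.
TRANS = {p: {s: (0.2 if p == s else 0.4) for s in GLYPHS} for p in GLYPHS}

def render(text):
    """Concatenate glyphs side by side into rows of 0/1 pixels."""
    return [[int(ch) for c in text for ch in GLYPHS[c][y]] for y in range(4)]

def mosaic(rows, block=2):
    """Replace each block x block cell with its mean intensity."""
    return [
        [sum(rows[y + dy][x + dx] for dy in range(block) for dx in range(block)) / block**2
         for x in range(0, len(rows[0]), block)]
        for y in range(0, len(rows), block)
    ]

def split(m, per_glyph=2):
    """Group mosaic columns into one observation vector per glyph position."""
    n = len(m[0]) // per_glyph
    return [[row[i * per_glyph + k] for row in m for k in range(per_glyph)]
            for i in range(n)]

# What each character looks like after mosaicing on its own.
TEMPLATES = {c: split(mosaic(render(c)))[0] for c in GLYPHS}

def viterbi(observations, sigma=0.1):
    """Most likely character sequence under the toy HMM (log-space Viterbi)."""
    states = list(TEMPLATES)

    def emit(s, obs):
        # Log-likelihood, up to a constant, of obs given the template for s.
        return -sum((o - t) ** 2 for o, t in zip(obs, TEMPLATES[s])) / (2 * sigma**2)

    score = {s: -math.log(len(states)) + emit(s, observations[0]) for s in states}
    back = []
    for obs in observations[1:]:
        prev, score, ptr = score, {}, {}
        for s in states:
            best = max(states, key=lambda p: prev[p] + math.log(TRANS[p][s]))
            score[s] = prev[best] + math.log(TRANS[best][s]) + emit(s, obs)
            ptr[s] = best
        back.append(ptr)
    path = [max(states, key=score.get)]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return "".join(reversed(path))

redacted = mosaic(render("cab"))       # the "redacted" image
print(viterbi(split(redacted)))        # decoded guess at the hidden text
```

Even in this simplified setting, mosaicing is a deterministic average of the underlying pixels, so each character leaves a distinguishable signature that a sequence model can match. The evaluation described in the abstract covers conditions this sketch ignores, such as unknown pixel offsets, varying grid sizes, and noise.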

Publisher

Walter de Gruyter GmbH

Subject

General Medicine

Cited by 38 articles.

1. Exploring How UK Public Authorities Use Redaction to Protect Personal Information;ACM Transactions on Management Information Systems;2024-09-11

2. Toward a Privacy-Preserving Face Recognition System: A Survey of Leakages and Solutions;ACM Computing Surveys;2024-06-17

3. Leveraging deep learning-assisted attacks against image obfuscation via federated learning;Neural Computing and Applications;2024-05-18

4. Manipulable, reversible and diversified de-identification via face identity disentanglement;Multimedia Tools and Applications;2024-02-19

5. Privacy-Enhancing Person Re-identification Framework – A Dual-Stage Approach;2024 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV);2024-01-03