Semantic Disclosure Control: semantics meets data privacy

Published:2018-06-11 Issue:3 Volume:42 Page:290-303
ISSN:1468-4527
Container-title:Online Information Review
language:en
Short-container-title:OIR

Author:

Montserrat Batet, David Sánchez

Abstract

Purpose: To overcome the limitations of purely statistical approaches to data protection, the purpose of this paper is to propose Semantic Disclosure Control (SeDC): an inherently semantic privacy protection paradigm that, by relying on state-of-the-art semantic technologies, rethinks privacy and data protection in terms of the meaning of the data.

Design/methodology/approach: The need for data protection mechanisms able to manage data from a semantic perspective is discussed and the limitations of statistical approaches are highlighted. Then, SeDC is presented by detailing how it can be enforced to detect and protect sensitive data.

Findings: So far, data privacy has been tackled from a statistical perspective; that is, available solutions focus only on the distribution of the data values. This contrasts with the semantic way in which humans understand and manage (sensitive) data. As a result, current solutions present limitations both in preventing disclosure risks and in preserving the semantics (utility) of the protected data.

Practical implications: SeDC captures more general, realistic and intuitive notions of privacy and information disclosure than purely statistical methods. As a result, it is better suited to protecting heterogeneous and unstructured data, which are the most common in current data release scenarios. Moreover, SeDC preserves the semantics of the protected data better than statistical approaches, which is crucial when using protected data for research.

Social implications: Individuals are increasingly aware of the privacy threats that the uncontrolled collection and exploitation of their personal data may produce. In this respect, SeDC offers an intuitive notion of privacy protection that users can easily understand. It also naturally captures the (non-quantitative) privacy notions stated in current legislation on personal data protection.

Originality/value: In contrast to statistical approaches to data protection, SeDC assesses disclosure risks and enforces data protection from a semantic perspective. As a result, it offers more general, intuitive, robust and utility-preserving protection of data, regardless of their type and structure.

Publisher

Emerald

Subject

Library and Information Sciences,Computer Science Applications,Information Systems

References: 32 articles.

1. Significance of term relationships on anonymization, 2011

2. t-Plausibility: generalizing words to desensitize text; Transactions on Data Privacy, 2012

3. Batet, M. and Sánchez, D. (2014), “Review on semantic similarity”, in Khosrow-Pour, M. (Ed.), Encyclopedia of Information Science and Technology, 3rd ed., IGI Global, pp. 7575-7583.

4. Utility preserving query log anonymization via semantic microaggregation; Information Sciences, 2013

5. The Rules of Redaction: identify, protect, review (and repeat); IEEE Security and Privacy Magazine, 2009

Cited by 5 articles.

1. Evaluating the disclosure risk of anonymized documents via a machine learning-based re-identification attack;Data Mining and Knowledge Discovery;2024-09-03

2. A privacy-preserving dialogue system based on argumentation;Intelligent Systems with Applications;2022-11

3. The Text Anonymization Benchmark (TAB): A Dedicated Corpus and Evaluation Framework for Text Anonymization;Computational Linguistics;2022

4. Utility-Preserving Privacy Protection of Textual Documents via Word Embeddings;IEEE Transactions on Knowledge and Data Engineering;2021

5. Social media analytics: analysis and visualisation of news diffusion using NodeXL;Online Information Review;2019-02-11