Proximity Measurement for Hierarchical Categorical Attributes in Big Data

Published: 2021-07-05 Volume: 2021 Page: 1-17
ISSN: 1939-0122
Container-title: Security and Communication Networks
Language: en
Short-container-title: Security and Communication Networks

Author:

El Ouazzani Zakariae¹, Braeken An², El Bakkali Hanan¹

Affiliation:

1. Rabat-IT Center, ENSIAS, Mohammed V University in Rabat, Rabat, Morocco

2. Industrial Engineering Department (INDI), Vrije Universiteit Brussel (VUB), Brussels, Belgium

Abstract

Most organizations store massive amounts of data in large databases for research, statistics, and mining purposes. Much of the accumulated data contains sensitive information about individuals, and its exposure may breach their privacy. Hence, ensuring privacy in big data is considered a very important issue. The concept of privacy aims to protect sensitive information from various attacks that may reveal the identity of individuals. Anonymization techniques are considered the best way to ensure privacy in big data. Various works have already been carried out that take horizontal clustering into account. L-diversity is one such technique; it handles sensitive numerical and categorical attributes. However, the majority of anonymization techniques that apply the L-diversity principle to hierarchical data cannot resist the similarity attack and therefore cannot fully ensure privacy. To prevent the similarity attack while preserving data utility, this paper proposes a hybrid technique dealing with categorical attributes. All the steps of the proposed algorithm are presented with detailed comments. Moreover, the algorithm is implemented and evaluated according to a well-known information loss-based criterion, the Normalized Certainty Penalty (NCP). The obtained results show a good balance between privacy and data utility.
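As a rough illustration of the NCP criterion named in the abstract, the sketch below uses the commonly cited definition for a categorical attribute: a value generalized to a hierarchy node costs 0 if the node is a leaf, and otherwise the number of leaves under that node divided by the total number of leaves. This is an assumption, since the abstract does not state the exact formula the paper uses. The toy hierarchy and the names `PARENT`, `leaves_under`, and `ncp` are hypothetical and do not come from the paper.

```python
# Hypothetical toy generalization hierarchy, stored as child -> parent.
# The values are invented for illustration only.
PARENT = {
    "flu": "respiratory",
    "bronchitis": "respiratory",
    "gastritis": "digestive",
    "ulcer": "digestive",
    "colitis": "digestive",
    "respiratory": "any",
    "digestive": "any",
}


def leaves_under(node):
    """Return the set of leaf values covered by a hierarchy node."""
    children = [child for child, parent in PARENT.items() if parent == node]
    if not children:
        return {node}  # a node without children is itself a leaf
    covered = set()
    for child in children:
        covered |= leaves_under(child)
    return covered


def ncp(node, root="any"):
    """NCP of one generalized categorical value (assumed definition).

    0 when the node covers a single leaf, otherwise
    (leaves under node) / (all leaves in the hierarchy).
    """
    covered = leaves_under(node)
    if len(covered) == 1:
        return 0.0
    return len(covered) / len(leaves_under(root))


if __name__ == "__main__":
    for value in ("flu", "respiratory", "digestive", "any"):
        print(value, ncp(value))
```

Under this definition an ungeneralized value carries no penalty and a value suppressed to the root carries the maximum penalty of 1, so a lower average NCP over a released table indicates less information loss.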

Publisher

Hindawi Limited

Subject

Computer Networks and Communications, Information Systems

Link

http://downloads.hindawi.com/journals/scn/2021/6612923.pdf

Reference: 56 articles.

1. Big data set privacy preserving through sensitive attribute-based grouping

2. Big healthcare data: preserving security and privacy

3. Privacy models for big data: a survey

4. An evaluation on big data generalization using k-Anonymity algorithm on cloud

5. Maximum delay anonymous clustering feature tree based privacy-preserving data publishing in social networks

Cited by 2 articles.

1. A divide-and-conquer approach to privacy-preserving high-dimensional big data release;Journal of Information Security and Applications;2024-06

2. Towards Optimization of Privacy-Utility Trade-Off Using Similarity and Diversity Based Clustering;IEEE Transactions on Emerging Topics in Computing;2023