Enhancing privacy for automatically detected quasi identifier using data anonymization

Published:2023-04-05 Issue:1 Volume:21 Page:71-91
ISSN:2405-6464
Container-title:Web Intelligence
Short-container-title:WEB

Author:

Sathiya Devi S.¹, Indhumathi R.²

Affiliation:

1. Department of CSE, UCE, BIT Campus, Anna University, Trichy, Tamil Nadu, India

2. Department of CSE, M.A.M College of Engineering & Technology, Trichy, Tamil Nadu, India

Abstract

The rapid advancement of information technology has made information storage and retrieval more efficient. As a result, most organizations, businesses, and governments release and exchange large amounts of microdata for commercial or research purposes. However, improper data exchange can result in privacy breaches. Many methods and strategies have been developed to address such breaches, and anonymization is one that many companies use. Identifying the Quasi Identifier (QI) is an essential step in anonymization. This paper therefore proposes a method called Quasi Identification Based on Tree (QIBT) for automatic QI identification. The method derives the QI from the relationship between the numbers of distinct values taken by sets of attributes. It uses a tree data structure to derive the unique and infrequent attribute values from the entire dataset at lower computational cost. The method consists of four phases: (i) unique attribute value computation, (ii) tree construction, (iii) computation of the quasi-identifier from the tree, and (iv) application of an anonymization technique to the identified QI. The proposed algorithm identifies the attributes with a high risk of disclosure. To improve usefulness, synthetic data are created only for the detected QI using a partial synthetic data generation technique. The method's efficiency is tested on a subset of the UCI machine learning dataset, and it produces better results than other current approaches.
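
The idea of deriving QI candidates from how many distinct values a set of attributes takes can be illustrated with a small brute-force sketch. This is not the tree-based QIBT algorithm, whose details are not given in the abstract. The function names, the `max_size` and `threshold` parameters, and the toy records are all hypothetical, chosen only for illustration.

```python
from collections import Counter
from itertools import combinations


def uniqueness_ratio(rows, attrs):
    """Fraction of rows whose value combination on `attrs` occurs exactly once."""
    counts = Counter(tuple(row[a] for a in attrs) for row in rows)
    unique_rows = sum(1 for c in counts.values() if c == 1)
    return unique_rows / len(rows)


def candidate_quasi_identifiers(rows, attributes, max_size=2, threshold=0.5):
    """Return minimal attribute sets whose uniqueness ratio meets `threshold`.

    Assumption: a high share of unique value combinations is used as a
    stand-in for disclosure risk. The paper's own criterion may differ.
    """
    found = []
    for size in range(1, max_size + 1):
        for attrs in combinations(attributes, size):
            # Skip supersets of an already flagged set to keep results minimal.
            if any(set(f) <= set(attrs) for f in found):
                continue
            if uniqueness_ratio(rows, attrs) >= threshold:
                found.append(attrs)
    return found


# Toy records (made up): no single attribute isolates a row, but some pairs do.
records = [
    {"age": 34, "zip": "620001", "sex": "F"},
    {"age": 34, "zip": "620002", "sex": "F"},
    {"age": 41, "zip": "620001", "sex": "M"},
    {"age": 41, "zip": "620002", "sex": "M"},
]
qi = candidate_quasi_identifiers(records, ["age", "zip", "sex"])
print(qi)
```

On these records the sketch flags the pairs (age, zip) and (zip, sex), because each of them singles out every row even though no attribute does so alone. Enumerating attribute combinations this way grows exponentially with `max_size`, which is the kind of cost a tree-based derivation is meant to reduce.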

Publisher

IOS Press

Subject

Artificial Intelligence, Computer Networks and Communications, Software
