Enhancing privacy for automatically detected quasi identifier using data anonymization

Author:

Sathiya Devi S.1, Indhumathi R.2

Affiliation:

1. Department of CSE, UCE, BIT Campus, Anna University, Trichy, Tamil Nadu, India

2. Department of CSE, M.A.M College of Engineering & Technology, Trichy, Tamil Nadu, India

Abstract

The rapid advancement of information technology has made information storage and retrieval more efficient. As a result, organizations, businesses, and governments release and exchange large amounts of microdata for commercial or research purposes. However, improper data exchange can lead to privacy breaches. Many methods and strategies have been developed to address such breaches, and anonymization is one that many companies use. Anonymization depends on identifying the quasi-identifier (QI). This paper therefore proposes a method called Quasi Identification Based on Tree (QIBT) for automatic QI identification. The method derives the QI from the relationship between the numbers of distinct values taken by sets of attributes, using a tree data structure to extract the unique and infrequent attribute values from the entire dataset at low computational cost. It consists of four phases: (i) unique attribute value computation, (ii) tree construction, (iii) computation of the quasi-identifier from the tree, and (iv) application of an anonymization technique to the identified QI. The proposed algorithm identifies the attributes with a high risk of disclosure. To improve utility, synthetic data are generated only for the detected QI, using a partial synthetic data generation technique. The method's efficiency is tested on a subset of the UCI machine learning dataset, and it produces superior results compared with other current approaches.
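The idea of deriving a QI from the number of distinct values taken by a set of attributes can be illustrated with a minimal sketch. This is not the QIBT algorithm itself: it does a brute-force scan over small attribute subsets instead of building a tree. The function names, the `threshold` of 0.9, the `max_size` limit, and the toy records are all assumptions made for illustration. An attribute set is flagged as a candidate QI when its value combinations are close to unique across the records.

```python
from itertools import combinations


def distinct_ratio(records, attrs):
    """Share of distinct value combinations on `attrs` relative to record count."""
    combos = {tuple(rec[a] for a in attrs) for rec in records}
    return len(combos) / len(records)


def candidate_quasi_identifiers(records, attributes, threshold=0.9, max_size=2):
    """Return minimal attribute sets whose distinct ratio reaches `threshold`.

    Hypothetical helper: the threshold and subset-size limit are illustrative
    choices, not values taken from the paper.
    """
    found = []
    for size in range(1, max_size + 1):
        for attrs in combinations(attributes, size):
            # Skip supersets of an already flagged set so results stay minimal.
            if any(set(f) <= set(attrs) for f in found):
                continue
            if distinct_ratio(records, attrs) >= threshold:
                found.append(attrs)
    return found


# Toy records (invented for this example).
records = [
    {"age": 34, "zip": "620001", "sex": "F"},
    {"age": 34, "zip": "620002", "sex": "M"},
    {"age": 41, "zip": "620001", "sex": "F"},
    {"age": 41, "zip": "620002", "sex": "M"},
]

print(candidate_quasi_identifiers(records, ["age", "zip", "sex"]))
```

In this toy data no single attribute separates the records, but the pairs (age, zip) and (age, sex) each identify every record uniquely, so they are the ones flagged. A tree-based approach such as the one the abstract describes would aim to reach the same kind of result without enumerating every subset.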

Publisher

IOS Press

Subject

Artificial Intelligence, Computer Networks and Communications, Software
