Privately vertically mining of sequential patterns based on differential privacy with high efficiency and utility

Author:

Liang Wenjuan, Zhang Wenke, Liang Songtao, Yuan Caihong

Abstract

Sequential pattern mining is one of the fundamental tools for many important data analysis tasks, such as web browsing behavior analysis. Based on frequent patterns, decision-makers can obtain both economic gains and social value. Sequential data, however, frequently contain sensitive information, and analyzing them directly raises privacy concerns for users. Differential privacy (DP), the most popular privacy model, has been employed to address this concern. Most existing DP solutions combine horizontal sequential pattern mining algorithms with differential privacy. Because horizontal algorithms are inefficient, these DP solutions cannot ensure high efficiency and accuracy while offering a strong privacy guarantee. We therefore propose privVertical, a new private sequential pattern mining scheme that combines a vertical mining algorithm with differential privacy to achieve this objective. Unlike DP solutions based on horizontal algorithms, privVertical improves efficiency by avoiding costly database scans and costly projected-database constructions. Moreover, to improve accuracy, a differentially private hash MapList (called privHashMap) is designed to record frequent concurrency items and their noisy support based on the Sparse Vector Technique. PrivHashMap is used to pre-prune excessive infrequent candidate sequences during private mining, and the Sparse Vector Technique is used to improve the accuracy of PrivHashMap. After these invalid candidate sequences are pruned, less noise is required to achieve the same level of privacy, which increases the accuracy of private mining. Theoretical privacy analysis proves that privVertical satisfies $\varepsilon$-differential privacy. Experiments show that privVertical achieves higher accuracy and efficiency at the same privacy level.
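The abstract names the Sparse Vector Technique without showing it, so the following is a minimal Python sketch of the textbook form of that technique (repeated AboveThreshold runs), not the paper's privHashMap construction. The function names `laplace` and `sparse_vector`, the parameter `max_hits`, and the sample supports are all hypothetical. The sketch assumes each support count has sensitivity 1, meaning that adding or removing one sequence changes it by at most 1, and it returns only the indices of the selected candidates, not their noisy supports.

```python
import random


def laplace(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def sparse_vector(supports, threshold, epsilon, max_hits, rng=None):
    """Return indices whose noisy support clears a noisy threshold.

    Assumption: every support count has sensitivity 1.
    The total budget `epsilon` is split evenly over at most `max_hits`
    AboveThreshold runs; each run ends at its first positive answer.
    """
    rng = rng or random.Random()
    eps_run = epsilon / max_hits  # budget for one AboveThreshold run
    noisy_threshold = threshold + laplace(2.0 / eps_run, rng)
    hits = []
    for i, support in enumerate(supports):
        # Each comparison uses fresh query noise.
        if support + laplace(4.0 / eps_run, rng) >= noisy_threshold:
            hits.append(i)
            if len(hits) == max_hits:
                break  # budget exhausted, stop answering
            # Start the next run with fresh threshold noise.
            noisy_threshold = threshold + laplace(2.0 / eps_run, rng)
    return hits


if __name__ == "__main__":
    # Hypothetical support counts for five candidate item pairs.
    candidate_supports = [100, 2, 90, 1, 80]
    print(sparse_vector(candidate_supports, threshold=50,
                        epsilon=1.0, max_hits=2))
```

In this form, comparisons that fall below the noisy threshold consume no additional privacy budget, which is why the technique suits a setting where most candidates are infrequent and only a few need to be kept.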

Funder

Scientific and Technological Project of Henan Province of China

National Natural Science Foundation of China

Publisher

Springer Science and Business Media LLC

Subject

Multidisciplinary
