Affiliation:
1. Department of Industrial Engineering, Tel Aviv University, Tel Aviv, Israel
Abstract
People’s location data are continuously tracked by various devices and sensors, enabling ongoing analysis of sensitive information that can violate people’s privacy and reveal confidential information. Synthetic data have been used to generate representative location sequences while maintaining users’ privacy. Nonetheless, the tradeoff between privacy and accuracy has not been addressed systematically. In this article, we analyze the use of different synthetic data generation models for long location sequences, including long short-term memory networks (LSTMs), Markov chains (MC), and variable-order Markov models (VMMs). We employ different performance measures, such as data similarity and privacy, and discuss the inherent tradeoff between them. Furthermore, we introduce additional measurements to quantify each of these measures. Based on the anonymous data of 300 thousand cellular-phone users, our work offers a road map for developing policies for synthetic data generation processes. We propose a framework for building data generation models and evaluating their effectiveness with respect to these accuracy and privacy measures.
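As an illustration of the simplest model family named in the abstract, the following is a minimal sketch of a first-order Markov chain fitted to location sequences and then sampled to produce synthetic ones. It is not the authors' implementation: the function names, the toy location labels, and the `exact_match_rate` check (a crude stand-in for a privacy measure, counting synthetic sequences that copy a real one verbatim) are all assumptions made for this example.

```python
import random
from collections import Counter, defaultdict


def fit_markov_chain(sequences):
    """Count first-order transitions between consecutive locations."""
    transitions = defaultdict(Counter)
    for seq in sequences:
        for current, nxt in zip(seq, seq[1:]):
            transitions[current][nxt] += 1
    return transitions


def generate_sequence(transitions, start, length, rng):
    """Sample a synthetic sequence by walking the fitted chain."""
    seq = [start]
    while len(seq) < length:
        options = transitions.get(seq[-1])
        if not options:  # dead end: no observed outgoing transition
            break
        locations = list(options.keys())
        weights = list(options.values())
        seq.append(rng.choices(locations, weights=weights, k=1)[0])
    return seq


def exact_match_rate(synthetic, real):
    """Hypothetical privacy proxy: share of synthetic sequences copied verbatim from real data."""
    real_set = {tuple(s) for s in real}
    return sum(tuple(s) in real_set for s in synthetic) / len(synthetic)


# Toy data (made up for illustration; not from the article's dataset).
real = [
    ["home", "cafe", "work", "gym", "home"],
    ["home", "work", "cafe", "work", "home"],
    ["home", "work", "gym", "home", "home"],
]
rng = random.Random(0)
model = fit_markov_chain(real)
synthetic = [generate_sequence(model, "home", 5, rng) for _ in range(100)]
rate = exact_match_rate(synthetic, real)
```

A higher-order or variable-order model would condition on longer histories, which tends to raise similarity to the real data while also raising the chance of reproducing real sequences; this is the kind of tradeoff the abstract refers to.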
Funder
Israel Ministry of Science
Koret Fund for Digital Living
Publisher
Association for Computing Machinery (ACM)
Cited by
4 articles.