Affiliation:
1. University of Maryland
Abstract
With the aim of increasing online privacy, we present ML-CB, a novel, machine-learning-based approach to blocking canvas fingerprinting, one of the three main ways website visitors are tracked online. Because canvas fingerprinting is, at its core, carried out by a JavaScript program, and because many of these programs are reused across the web, we are able to fit several machine learning models to a semantic representation of a potentially offending program, yielding accurate and robust classifiers. Our supervised learning approach is trained on a dataset we created by scraping roughly half a million websites with a custom Google Chrome extension that stores canvas-related information. Labeling leverages our key insight that the images drawn by canvas fingerprinting programs have a facially distinct appearance, which allows us to manually classify files based on the images they draw. We take this one step further and train our classifiers not on the malleable images themselves, but on the harder-to-change underlying source code that generates them. As a result, ML-CB allows for more accurate tracker blocking.
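The idea of classifying canvas scripts by their source code rather than by the images they draw can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the identifier-level tokenizer is only a crude stand-in for a semantic representation, the multinomial naive Bayes model is a generic choice rather than one of the models fitted in this work, and the four training snippets and their labels are hand-written toy data, not samples from the scraped dataset.

```python
import math
import re
from collections import Counter

# Hypothetical tokenizer: identifier-like tokens only (an illustrative assumption).
TOKEN_RE = re.compile(r"[A-Za-z_$][A-Za-z0-9_$]*")


def tokenize(js_source):
    """Split JavaScript source text into identifier-like tokens."""
    return TOKEN_RE.findall(js_source)


class NaiveBayes:
    """Multinomial naive Bayes over token counts, with add-one smoothing."""

    def fit(self, sources, labels):
        self.class_counts = Counter(labels)
        self.token_counts = {label: Counter() for label in self.class_counts}
        for src, label in zip(sources, labels):
            self.token_counts[label].update(tokenize(src))
        self.vocab = set()
        for counts in self.token_counts.values():
            self.vocab.update(counts)
        return self

    def predict(self, js_source):
        tokens = [t for t in tokenize(js_source) if t in self.vocab]
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.token_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            # Log prior plus smoothed log likelihood of each known token.
            score = math.log(n_docs / total_docs)
            for tok in tokens:
                score += math.log((counts[tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training data (made up for this sketch).
TRAIN = [
    ("var c = document.createElement('canvas'); var ctx = c.getContext('2d'); "
     "ctx.textBaseline = 'top'; ctx.font = '14px Arial'; "
     "ctx.fillText('Cwm fjordbank glyphs vext quiz', 2, 15); var fp = c.toDataURL();",
     "fingerprinting"),
    ("var el = document.createElement('canvas'); var g = el.getContext('2d'); "
     "g.fillStyle = '#f60'; g.fillRect(0, 0, 10, 10); "
     "g.fillText('fingerprint', 4, 17); report(hash(el.toDataURL()));",
     "fingerprinting"),
    ("var c = document.getElementById('chart'); var ctx = c.getContext('2d'); "
     "ctx.beginPath(); ctx.moveTo(0, 0); ctx.lineTo(100, 50); ctx.stroke();",
     "benign"),
    ("var ctx = game.getContext('2d'); ctx.drawImage(sprite, x, y); "
     "ctx.clearRect(0, 0, w, h); requestAnimationFrame(loop);",
     "benign"),
]

clf = NaiveBayes().fit([src for src, _ in TRAIN], [label for _, label in TRAIN])

suspect = ("var cv = document.createElement('canvas'); var g2 = cv.getContext('2d'); "
           "g2.fillText('hello', 2, 2); send(cv.toDataURL());")
drawing = ("var ctx = board.getContext('2d'); ctx.beginPath(); "
           "ctx.lineTo(5, 5); ctx.stroke();")

print(clf.predict(suspect))
print(clf.predict(drawing))
```

In this toy setting, the script that draws text to an off-screen canvas and then reads it back with `toDataURL` scores closer to the fingerprinting examples, while the plain line-drawing script scores closer to the benign ones. A real classifier would need a far larger labeled corpus and a representation that survives renaming and minification.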