Affiliation:
1. EPFL, Lausanne, Switzerland.
2. Imperial College London, London, UK.
Abstract
Although machine learning models are widely used today, the relationship between a model and its training dataset is not well understood. We explore correlation inference attacks, which ask whether and when a model leaks information about the correlations between the input variables of its training dataset. We first propose a model-less attack, where an adversary exploits the spherical parameterization of correlation matrices alone to make an informed guess. Second, we propose a model-based attack, where an adversary exploits black-box model access to infer the correlations using minimal and realistic assumptions. Third, we evaluate our attacks against logistic regression and multilayer perceptron models on three tabular datasets and show that the models leak correlations. Finally, we show how extracted correlations can be used as building blocks for attribute inference attacks and enable weaker adversaries. Our results raise fundamental questions about what a model does and should remember from its training set.
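To illustrate why the structure of correlation matrices can by itself support an informed guess, the sketch below works through the three-variable case. It assumes, purely for illustration, that the adversary knows the correlations of two inputs X1 and X2 with a third variable Y; the abstract does not spell out this setup. Under the spherical parameterization, corr(X1, Y) = cos(a), corr(X2, Y) = cos(b), and corr(X1, X2) = cos(a)cos(b) + sin(a)sin(b)cos(c) with a free angle c in [0, pi], so the unknown correlation is confined to the interval [cos(a + b), cos(a - b)]. The function names and the uniform sampling of c are hypothetical choices, not the authors' attack.

```python
import math
import random


def correlation_bounds(rho_1y: float, rho_2y: float) -> tuple[float, float]:
    """Feasible interval for corr(X1, X2) given corr(X1, Y) and corr(X2, Y).

    Hypothetical helper: it follows from requiring the 3x3 correlation
    matrix to be positive semi-definite.
    """
    # Clamp to [-1, 1] so acos never fails on tiny floating-point overshoots.
    a = math.acos(max(-1.0, min(1.0, rho_1y)))
    b = math.acos(max(-1.0, min(1.0, rho_2y)))
    # The free angle c ranges over [0, pi]: c = pi gives the lower bound,
    # c = 0 gives the upper bound.
    return math.cos(a + b), math.cos(a - b)


def sample_feasible_correlations(rho_1y: float, rho_2y: float,
                                 n: int = 1000, seed: int = 0) -> list[float]:
    """Draw candidate values of corr(X1, X2) by sampling the free angle.

    Assumption: c is drawn uniformly in [0, pi]; other priors are possible.
    """
    rng = random.Random(seed)
    a = math.acos(max(-1.0, min(1.0, rho_1y)))
    b = math.acos(max(-1.0, min(1.0, rho_2y)))
    samples = []
    for _ in range(n):
        c = rng.uniform(0.0, math.pi)
        samples.append(math.cos(a) * math.cos(b)
                       + math.sin(a) * math.sin(b) * math.cos(c))
    return samples


if __name__ == "__main__":
    low, high = correlation_bounds(0.8, 0.6)
    print(f"corr(X1, X2) must lie in [{low:.3f}, {high:.3f}]")
```

When both known correlations are strong, the interval is narrow and a guess made without any model access can already be informative; when both are zero, the interval is the full range [-1, 1] and the constraint says nothing.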
Publisher
American Association for the Advancement of Science (AAAS)