Author:
Blum, Avrim; Hopcroft, John; Kannan, Ravindran
Abstract
This book provides an introduction to the mathematical and algorithmic foundations of data science, including machine learning, high-dimensional geometry, and analysis of large networks. Topics include the counterintuitive nature of data in high dimensions, important linear algebraic techniques such as singular value decomposition, the theory of random walks and Markov chains, the fundamentals of and important algorithms for machine learning, algorithms and analysis for clustering, probabilistic models for large networks, representation learning including topic modelling and non-negative matrix factorization, wavelets and compressed sensing. Important probabilistic techniques are developed including the law of large numbers, tail inequalities, analysis of random projections, generalization guarantees in machine learning, and moment methods for analysis of phase transitions in large random graphs. Additionally, important structural and complexity measures are discussed such as matrix norms and VC-dimension. This book is suitable for both undergraduate and graduate courses in the design and analysis of algorithms for data.
Publisher
Cambridge University Press
References: 346 articles.
Cited by
107 articles.