Foundations of Data Science

Authors:

Avrim Blum, John Hopcroft, Ravindran Kannan

Abstract

This book provides an introduction to the mathematical and algorithmic foundations of data science, including machine learning, high-dimensional geometry, and analysis of large networks. Topics include the counterintuitive nature of data in high dimensions, important linear algebraic techniques such as singular value decomposition, the theory of random walks and Markov chains, the fundamentals of and important algorithms for machine learning, algorithms and analysis for clustering, probabilistic models for large networks, representation learning including topic modelling and non-negative matrix factorization, wavelets and compressed sensing. Important probabilistic techniques are developed including the law of large numbers, tail inequalities, analysis of random projections, generalization guarantees in machine learning, and moment methods for analysis of phase transitions in large random graphs. Additionally, important structural and complexity measures are discussed such as matrix norms and VC-dimension. This book is suitable for both undergraduate and graduate courses in the design and analysis of algorithms for data.

Publisher

Cambridge University Press

Cited by 107 articles.

1. MapReduce algorithms for robust center-based clustering in doubling metrics;Journal of Parallel and Distributed Computing;2024-12

2. Alignment and comparison of directed networks via transition couplings of random walks;Journal of the Royal Statistical Society Series B: Statistical Methodology;2024-09-10

3. Model orthogonalization and Bayesian forecast mixing via principal component analysis;Physical Review Research;2024-09-09

4. Estimation of Skill Distributions;IEEE Transactions on Information Theory;2024-09

5. Utilizing machine learning techniques for enhanced water quality monitoring;Water Quality Research Journal;2024-08-30
