Foundations of Machine Learning-西安交通大学图书馆

Foundations of Machine Learning

发布日期:2015-07-02 浏览次

Foundations of Machine Learning

[BOOK DESCRIPTION]

This graduate-level textbook introduces fundamental concepts and methods in machine learning. It describes several important modern algorithms, provides the theoretical underpinnings of these algorithms, and illustrates key aspects for their application. The authors aim to present novel theoretical tools and concepts while giving concise proofs even for relatively advanced topics. Foundations of Machine Learning fills the need for a general textbook that also offers theoretical details and an emphasis on proofs. Certain topics that are often treated with insufficient attention are discussed in more detail here; for example, entire chapters are devoted to regression, multi-class classification, and ranking. The first three chapters lay the theoretical foundation for what follows, but each remaining chapter is mostly self-contained. The appendix offers a concise probability review, a short introduction to convex optimization, tools for concentration bounds, and several basic properties of matrices and norms used in the book. The book is intended for graduate students and researchers in machine learning, statistics, and related areas; it can be used either as a textbook or as a reference text for a research seminar.

[TABLE OF CONTENTS]

Preface p. xi

Introduction p. 1

Applications and problems p. 1

Definitions and terminology p. 3

Cross-validation p. 5

Learning scenarios p. 7

Outline p. 8

The PAC Learning Framework p. 11

The PAC learning model p. 11

Guarantees for finite hypothesis sets - consistent case p. 17

Guarantees for finite hypothesis sets - inconsistent case p. 21

Generalities p. 24

Deterministic versus stochastic scenarios p. 24

Bayes error and noise p. 25

Estimation and approximation errors p. 26

Model selection p. 27

Chapter notes p. 28

Exercises p. 29

Rademacher Complexity and VC-Dimension p. 33

Rademacher complexity p. 34

Growth function p. 38

VC-dimension p. 41

Lower bounds p. 48

Chapter notes p. 54

Exercises p. 55

Support Vector Machines p. 63

Linear classification p. 63

SVMs - separable case p. 64

Primal optimization problem p. 64

Support vectors p. 66

Dual optimization problem p. 67

Leave-one-out analysis p. 69

SVMs - non-separable case p. 71

Primal optimization problem p. 72

Support vectors p. 73

Dual optimization problem p. 74

Margin theory p. 75

Chapter notes p. 83

Exercises p. 84

Kernel Methods p. 89

Introduction p. 89

Positive definite symmetric kernels p. 92

Definitions p. 92

Reproducing kernel Hilbert space p. 94

Properties p. 96

Kernel-based algorithms p. 100

SVMs with PDS kernels p. 100

Representer theorem p. 101

Learning guarantees p. 102

Negative definite symmetric kernels p. 103

Sequence kernels p. 106

Weighted transducers p. 106

Rational kernels p. 111

Chapter notes p. 115

Exercises p. 116

Boosting p. 121

Introduction p. 121

AdaBoost p. 122

Bound on the empirical error p. 124

Relationship with coordinate descent p. 126

Relationship with logistic regression p. 129

Standard use in practice p. 129

Theoretical results p. 130

VC-dimension-based analysis p. 131

Margin-based analysis p. 131

Margin maximization p. 136

Game-theoretic interpretation p. 137

Discussion p. 140

Chapter notes p. 141

Exercises p. 142

On-Line Learning p. 147

Introduction p. 147

Prediction with expert advice p. 148

Mistake bounds and Halving algorithm p. 148

Weighted majority algorithm p. 150

Randomized weighted majority algorithm p. 152

Exponential weighted average algorithm p. 156

Linear classification p. 159

Perceptron algorithm p. 160

Winnow algorithm p. 168

On-line to batch conversion p. 171

Game-theoretic connection p. 174

Chapter notes p. 175

Exercises p. 176

Multi-Class Classification p. 183

Multi-class classification problem p. 183

Generalization bounds p. 185

Uncombined multi-class algorithms p. 191

Multi-class SVMs p. 191

Multi-class boosting algorithms p. 192

Decision trees p. 194

Aggregated multi-class algorithms p. 198

One-versus-all p. 198

One-versus-one p. 199

Error-correction codes p. 201

Structured prediction algorithms p. 203

Chapter notes p. 206

Exercises p. 207

Ranking p. 209

The problem of ranking p. 209

Generalization bound p. 211

Ranking with SVMs p. 213

RankBoost p. 214

Bound on the empirical error p. 216

Relationship with coordinate descent p. 218

Margin bound for ensemble methods in ranking p. 220

Bipartite ranking p. 221

Boosting in bipartite ranking p. 222

Area under the ROC curve p. 224

Preference-based setting p. 226

Second-stage ranking problem p. 227

Deterministic algorithm p. 229

Randomized algorithm p. 230

Extension to other loss functions p. 231

Discussion p. 232

Chapter notes p. 233

Exercises p. 234

Regression p. 237

The problem of regression p. 237

Generalization bounds p. 238

Finite hypothesis sets p. 238

Rademacher complexity bounds p. 239

Pseudo-dimension bounds p. 241

Regression algorithms p. 245

Linear regression p. 245

Kernel ridge regression p. 247

Support vector regression p. 252

Lasso p. 257

Group norm regression algorithms p. 260

On-line regression algorithms p. 261

Chapter notes p. 262

Exercises p. 263

Algorithmic Stability p. 267

Definitions p. 267

Stability-based generalization guarantee p. 268

Stability of kernel-based regularization algorithms p. 270

Application to regression algorithms: SVR and KRR p. 274

Application to classification algorithms: SVMs p. 276

Discussion p. 276

Chapter notes p. 277

Exercises p. 277

Dimensionality Reduction p. 281

Principal Component Analysis p. 282

Kernel Principal Component Analysis (KPCA) p. 283

KPCA and manifold learning p. 285

Isomap p. 285

Laplacian eigenmaps p. 286

Locally linear embedding (LLE) p. 287

Johnson-Lindenstrauss lemma p. 288

Chapter notes p. 290

Exercises p. 290

Learning Automata and Languages p. 293

Introduction p. 293

Finite automata p. 294

Efficient exact learning p. 295

Passive learning p. 296

Learning with queries p. 297

Learning automata with queries p. 298

Identification in the limit p. 303

Learning reversible automata p. 304

Chapter notes p. 309

Exercises p. 310

Reinforcement Learning p. 313

Learning scenario p. 313

Markov decision process model p. 314

Policy p. 315

Definition p. 315

Policy value p. 316

Policy evaluation p. 316

Optimal policy p. 318

Planning algorithms p. 319

Value iteration p. 319

Policy iteration p. 322

Linear programming p. 324

Learning algorithms p. 325

Stochastic approximation p. 326

TD(0) algorithm p. 330

Q-learning algorithm p. 331

SARSA p. 334

TD(¿) algorithm p. 335

Large state space p. 336

Chapter notes p. 337

Conclusion p. 339

Linear Algebra Review p. 341

Vectors and norms p. 341

Norms p. 341

Dual norms p. 342

Matrices p. 344

Matrix norms p. 344

Singular value decomposition p. 345

Symmetric positive semidefinite (SPSD) matrices p. 346

Convex Optimization p. 349

Differentiation and unconstrained optimization p. 349

Convexity p. 350

Constrained optimization p. 353

Chapter notes p. 357

Probability Review p. 359

Probability p. 359

Random variables p. 359

Conditional probability and independence p. 361

Expectation, Markov's inequality, and moment-generating function p. 363

Variance and Chebyshev's inequality p. 365

Concentration inequalities p. 369

Hoeffding's inequality p. 369

McDiarmid's inequality p. 371

Other inequalities p. 373

Binomial distribution: Slud's inequality p. 374

Normal distribution: tail bound p. 374

Khintchine-Kahane inequality p. 374

Chapter notes p. 376

Exercises p. 377

Notation p. 379

References p. 381

Index p. 397

上一条：Electric Power Substations Engineering
下一条：Organic Photovoltaics: Materials, Device Physics, and Manufacturing Technologies

【关闭】