Concise Machine Learning

Concise Machine Learning – Jonathan Richard Shewchuk – Department of Electrical Engineering and Computer Sciences, University of California at Berkeley, Berkeley, California 94720 – May 5, 2025


This report contains lecture notes for UC Berkeley’s introductory class on Machine Learning. It covers
many methods for classification and regression, including five and a half lectures on neural networks, and
a few methods for clustering and dimensionality reduction. It is concise because nothing is included that
cannot be written or spoken in a single semester’s lectures (with whiteboard lectures and almost no slides!)
and because the choice of topics is limited to a small selection of particularly useful, popular algorithms.

Contents

1 – Introduction; Classification; Train, Validate, Test
2 – Linear Classifiers, the Centroid Method, and Perceptrons
3 – Perceptron Learning; Maximum Margin Classifiers
4 – Soft-Margin Support Vector Machines; Features
5 – Machine Learning Abstractions and Numerical Optimization
6 – Decision Theory; Generative and Discriminative Models
7 – Gaussian Discriminant Analysis; Maximum Likelihood Estimation
8 – Eigenvectors and the (Anisotropic) Multivariate Normal Distribution
9 – Anisotropic Gaussians: MLE, QDA, and LDA Revisited
10 – Regression, including Least-Squares Linear and Logistic Regression
11 – Polynomial and Weighted Regression; Newton’s Method; ROC Curves
12 – Statistical Justifications; the Bias-Variance Decomposition
13 – Shrinkage: Ridge Regression, Subset Selection, and Lasso
14 – Decision Trees
15 – More Decision Trees, Ensemble Learning, and Random Forests
16 – Neural Networks
17 – Vanishing Gradients; ReLUs; Output Units and Losses; Neurobiology
18 – Neurobiology; Faster Neural Network Training
19 – Convolutional Neural Networks
20 – Unsupervised Learning: Principal Components Analysis
21 – The Singular Value Decomposition; Clustering
22 – The Pseudoinverse; Better Generalization for Neural Nets
23 – Residual Networks; Batch Normalization; AdamW
24 – Boosting; Nearest Neighbor Classification
25 – Nearest Neighbor Algorithms: Voronoi Diagrams and k-d Trees

A – Bonus Lecture: Learning Theory
B – Bonus Lecture: The Kernel Trick
C – Bonus Lecture: Spectral Graph Clustering
D – Bonus Lecture: Multiple Eigenvectors; Latent Factor Analysis
E – Bonus Lecture: High Dimensions; Random Projection
