Last time
- SVM
- Maximizes the margin: take the point at minimum distance from the boundary and maximize that distance
- Can fix the scale of the numerator to turn this into a constrained optimization problem
- Minimize the $\ell_2$ norm of the weights such that every point's functional margin is at least 1
- For non separable data
- “Soft-margin” SVM
- Get min of weights + “slack”
- What is the loss function?
- Hinge loss
SVM: Hinge Loss
SVM is the linear classifier that minimizes the total hinge loss on all the training data.
$\mathrm{min}_{w,b}\sum_n \mathrm{max}(0, 1 - y_n[w^T\phi(x_n)+b]) + \dfrac{\lambda}{2}||w||^2_2$ — is this the same as the geometric (max-margin) solution?
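The objective above is easy to evaluate directly. A minimal sketch (dataset, variable names, and $\lambda$ value are our own, with $\phi(x) = x$ for simplicity): when every point has margin $\geq 1$, the hinge terms vanish and only the regularizer remains.

```python
import numpy as np

def hinge_objective(w, b, X, y, lam):
    """Total hinge loss plus L2 regularization (illustrative sketch)."""
    margins = y * (X @ w + b)               # y_n [w^T x_n + b]
    hinge = np.maximum(0.0, 1.0 - margins)  # max(0, 1 - margin)
    return hinge.sum() + 0.5 * lam * np.dot(w, w)

# Tiny separable dataset with labels in {-1, +1}
X = np.array([[2.0, 2.0], [1.5, 1.0], [-1.0, -1.5], [-2.0, -1.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])

# This w gives every point margin >= 1, so only the regularizer
# contributes: 0.5 * 0.1 * ||w||^2 = 0.1
w, b = np.array([1.0, 1.0]), 0.0
print(hinge_objective(w, b, X, y, lam=0.1))  # -> 0.1
```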
Kernelizing SVMs
Preview:
- Solving its “dual” problem, which is kernelizable
- Show the dual problem gives the same solution as the original
- “Primal” and “Dual” problems, strong duality
Constrained Optimization
Using Lagrange Multipliers.
“Original” / “Primal” Problem:
$\mathrm{min}_x f_0(x)$
Such that: $f_i(x) \leq 0$ for $i = 1, \dots, m$
$h_j(x) = 0$ for $j = 1, \dots, p$
(Constraint set $\mathcal{C}$: the points satisfying all constraints)
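To see the primal/dual machinery concretely, here is a toy problem of our own (not from the notes): minimize $f_0(x) = (x-2)^2$ subject to $f_1(x) = x - 1 \leq 0$. The Lagrangian is $L(x, \lambda) = f_0(x) + \lambda f_1(x)$ with $\lambda \geq 0$; minimizing over $x$ gives the dual function, and maximizing the dual over $\lambda$ recovers the primal optimum here (strong duality holds for this convex problem).

```python
import numpy as np

# Toy primal: min_x (x - 2)^2  s.t.  x - 1 <= 0  (optimum: x* = 1, value 1)
def f0(x):
    return (x - 2.0) ** 2

def lagrangian(x, lam):
    return f0(x) + lam * (x - 1.0)

# Dual function g(lam) = min_x L(x, lam): stationarity 2(x - 2) + lam = 0
# gives x(lam) = 2 - lam/2, so g(lam) = -lam^2/4 + lam.
def g(lam):
    x = 2.0 - lam / 2.0
    return lagrangian(x, lam)

# Maximize g over lam >= 0 on a grid; the maximizer lam* = 2 yields
# g(lam*) = 1, matching the primal optimum f0(1) = 1.
lams = np.linspace(0.0, 4.0, 4001)
best = lams[np.argmax([g(l) for l in lams])]
print(best, g(best))  # -> 2.0 1.0
```

Note the multiplier $\lambda^* = 2$ is strictly positive because the constraint is active at the optimum, a pattern (complementary slackness) that reappears in the SVM dual.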