Last time
- SVM
- Maximizes the margin: take the point at minimum distance from the boundary and maximize that distance
- Can fix the scale of the numerator to turn this into a constrained optimization problem
- Minimize the $\ell_2$ norm of the weights such that every point's functional margin is at least 1
- For non separable data
- “Soft-margin” SVM
- Get min of weights + “slack”
- What is the loss function?
- Hinge loss
SVM: Hinge Loss
SVM is the linear classifier that minimizes the total hinge loss on all the training data.
$\mathrm{min}_{w,b}\sum_n \mathrm{max}(0, 1 - y_n[w^T\phi(x_n)+b]) + \dfrac{\lambda}{2}||w||^2_2$ — is this the same as the geometric (max-margin) solution?
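The objective above is easy to evaluate directly. A minimal sketch (dataset, variable names, and $\lambda$ value are our own, with $\phi(x) = x$ for simplicity): when every point has margin $\geq 1$, the hinge terms vanish and only the regularizer remains.

```python
import numpy as np

def hinge_objective(w, b, X, y, lam):
    """Total hinge loss plus L2 regularization (illustrative sketch)."""
    margins = y * (X @ w + b)               # y_n [w^T x_n + b]
    hinge = np.maximum(0.0, 1.0 - margins)  # max(0, 1 - margin)
    return hinge.sum() + 0.5 * lam * np.dot(w, w)

# Tiny separable dataset with labels in {-1, +1}
X = np.array([[2.0, 2.0], [1.5, 1.0], [-1.0, -1.5], [-2.0, -1.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])

# This w gives every point margin >= 1, so only the regularizer
# contributes: 0.5 * 0.1 * ||w||^2 = 0.1
w, b = np.array([1.0, 1.0]), 0.0
print(hinge_objective(w, b, X, y, lam=0.1))  # -> 0.1
```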
Kernelizing SVMs
Preview:
- Solving its “dual” problem, which is kernelizable
- Show the dual problem gives the same solution as the original
- “Primal” and “Dual” problems, strong duality
Constrained Optimization
Using Lagrange Multipliers.
“Original” / “Primal” Problem:
$\mathrm{min}_x f_0(x)$
Such that: $f_i(x) \leq 0$ for $i = 1, \dots, m$
$h_j(x) = 0$ for $j = 1, \dots, p$
(Constraint set $\mathcal{C}$: the points satisfying all constraints)
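To see the primal/dual machinery concretely, here is a toy problem of our own (not from the notes): minimize $f_0(x) = (x-2)^2$ subject to $f_1(x) = x - 1 \leq 0$. The Lagrangian is $L(x, \lambda) = f_0(x) + \lambda f_1(x)$ with $\lambda \geq 0$; minimizing over $x$ gives the dual function, and maximizing the dual over $\lambda$ recovers the primal optimum here (strong duality holds for this convex problem).

```python
import numpy as np

# Toy primal: min_x (x - 2)^2  s.t.  x - 1 <= 0  (optimum: x* = 1, value 1)
def f0(x):
    return (x - 2.0) ** 2

def lagrangian(x, lam):
    return f0(x) + lam * (x - 1.0)

# Dual function g(lam) = min_x L(x, lam): stationarity 2(x - 2) + lam = 0
# gives x(lam) = 2 - lam/2, so g(lam) = -lam^2/4 + lam.
def g(lam):
    x = 2.0 - lam / 2.0
    return lagrangian(x, lam)

# Maximize g over lam >= 0 on a grid; the maximizer lam* = 2 yields
# g(lam*) = 1, matching the primal optimum f0(1) = 1.
lams = np.linspace(0.0, 4.0, 4001)
best = lams[np.argmax([g(l) for l in lams])]
print(best, g(best))  # -> 2.0 1.0
```

Note the multiplier $\lambda^* = 2$ is strictly positive because the constraint is active at the optimum, a pattern (complementary slackness) that reappears in the SVM dual.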