Lecture Slides
Last time
Kernel Methods
- Mapped inputs into a higher-dimensional feature space
- If the dimension of the mapping is higher, explicitly computing the features is expensive and the model is more prone to overfitting
- How do we get the advantages of the richer feature space while keeping the computation the same?
- Can we compute the inner product in feature space cheaply?
- If the algorithm can be expressed purely in terms of inner products, we can substitute a kernel function and gain efficiency (see the sketch below)
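A minimal sketch of this idea (assuming a degree-2 homogeneous polynomial kernel on 2-D inputs): the kernel returns the same value as the inner product of the explicitly expanded features, without ever materializing them.

```python
import numpy as np

def phi(x):
    """Explicit degree-2 feature map for a 2-D input:
    phi(x) = (x1^2, sqrt(2)*x1*x2, x2^2)."""
    return np.array([x[0] ** 2, np.sqrt(2) * x[0] * x[1], x[1] ** 2])

def poly_kernel(x, z):
    """Degree-2 polynomial kernel: k(x, z) = (x . z)^2."""
    return np.dot(x, z) ** 2

x, z = np.array([1.0, 2.0]), np.array([3.0, 0.5])
print(np.dot(phi(x), phi(z)))  # 16.0 -- inner product in feature space
print(poly_kernel(x, z))       # 16.0 -- same value, no feature expansion
```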
How do we kernelize algorithms?
- Kernelized ridge regression: let $\Phi$ be the matrix whose rows are the transformed features $\phi(x_i)$
- We find that the optimal weights are a linear combination of the training features: $w = \Phi^\top \alpha$
- So we can solve for the coefficients directly: $\alpha = (\Phi \Phi^\top + \lambda I)^{-1} y = (K + \lambda I)^{-1} y$, where $K_{ij} = k(x_i, x_j)$
- The weights still live in the higher-dimensional space
- But prediction can also be done purely in terms of inner products: $\hat{y}(x) = w^\top \phi(x) = \sum_i \alpha_i \, k(x_i, x)$ (see the sketch below)
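A minimal NumPy sketch of kernelized ridge regression (the RBF kernel, the regularizer `lam`, and all names here are illustrative assumptions, not the lecture's exact setup):

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    """Kernel matrix K[i, j] = exp(-gamma * ||A_i - B_j||^2)."""
    sq_dists = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq_dists)

def fit(X, y, lam=1e-2):
    """Solve (K + lam*I) alpha = y for the dual coefficients alpha."""
    K = rbf_kernel(X, X)
    return np.linalg.solve(K + lam * np.eye(len(X)), y)

def predict(X_train, alpha, X_new):
    """y_hat(x) = sum_i alpha_i * k(x_i, x): only kernel evaluations needed."""
    return rbf_kernel(X_new, X_train) @ alpha

# Toy 1-D regression: fit noisy sin(x), then predict at new points
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(50, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(50)
alpha = fit(X, y)
print(predict(X, alpha, np.array([[0.0], [1.5]])))  # close to sin(0), sin(1.5)
```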
Other kernel functions
- The RBF kernel corresponds to an infinite-dimensional feature space
- The polynomial kernel of degree $d$ on $D$-dimensional inputs corresponds to a feature space of dimension $\binom{D+d}{d}$ (all monomials of degree up to $d$)
- A valid kernel function must be positive semi-definite: every Gram matrix $K$ it produces satisfies $z^\top K z \ge 0$ for all $z$
- This characterization is known as Mercer's theorem: such a function is an inner product in some feature space (quick check below)
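A quick numerical sanity check of the PSD condition (a sketch, not a proof; the kernels below are chosen only for illustration): build the Gram matrix on sample points and verify its eigenvalues are non-negative.

```python
import numpy as np

def gram(kernel, X):
    """Gram matrix K[i, j] = k(x_i, x_j)."""
    return np.array([[kernel(a, b) for b in X] for a in X])

def looks_psd(K, tol=1e-9):
    """Symmetric PSD check: all eigenvalues >= 0 up to tolerance."""
    return bool(np.all(np.linalg.eigvalsh(K) >= -tol))

X = np.random.default_rng(1).standard_normal((20, 3))
rbf = lambda a, b: np.exp(-np.sum((a - b) ** 2))
bad = lambda a, b: -np.dot(a, b)        # negated inner product: not a kernel
print(looks_psd(gram(rbf, X)))          # True
print(looks_psd(gram(bad, X)))          # False
```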
Rules of composing kernel functions
- Linearity: $a\,k_1 + b\,k_2$ is a kernel for scalars $a, b > 0$
- Multiplication: $k_1 \cdot k_2$ is a kernel
- Exponentiation: $e^{k}$ is a kernel whenever $k$ is a kernel (demonstrated below)
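A small demonstration of these rules (the base kernels are assumed for illustration): each composition yields a Gram matrix that is still positive semi-definite.

```python
import numpy as np

k1 = lambda a, b: np.dot(a, b)                     # linear kernel
k2 = lambda a, b: np.exp(-np.sum((a - b) ** 2))    # RBF kernel

k_lin  = lambda a, b: 2.0 * k1(a, b) + 0.5 * k2(a, b)  # positive combination
k_prod = lambda a, b: k1(a, b) * k2(a, b)               # product
k_exp  = lambda a, b: np.exp(k1(a, b))                  # e^k

X = np.random.default_rng(2).standard_normal((15, 4))
for k in (k_lin, k_prod, k_exp):
    K = np.array([[k(a, b) for b in X] for a in X])
    print(np.linalg.eigvalsh(K).min() >= -1e-8)    # True for each rule
```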
Support Vector Machines