Linear Regression with one variable
- Hypothesis: \(h_{\theta}(x) = {\theta}_{0} + {\theta}_{1} x\)
- Cost function: \(J(\theta_{0},\theta_{1})=\frac{1}{2m}\sum_{i=1}^{m}\left(h_{\theta}(x^{(i)})-y^{(i)}\right)^2\)
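A minimal NumPy sketch of the hypothesis and cost function above, assuming a small hypothetical dataset `x`, `y` (the values are illustrative only):

```python
import numpy as np

# Hypothetical toy data: x is the single feature, y the target.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 2.5, 3.5, 4.5])

def hypothesis(theta0, theta1, x):
    """h_theta(x) = theta0 + theta1 * x"""
    return theta0 + theta1 * x

def cost(theta0, theta1, x, y):
    """J(theta0, theta1) = (1 / 2m) * sum_i (h_theta(x_i) - y_i)^2"""
    m = len(y)
    errors = hypothesis(theta0, theta1, x) - y
    return np.sum(errors ** 2) / (2 * m)

print(cost(0.0, 1.0, x, y))  # cost at an arbitrary parameter choice
```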
Gradient descent
Goal: minimize \(J(\theta_{0},\theta_{1})\), or more generally \(J(\theta_{0},\theta_{1},\cdots,\theta_{n})\)
- Gradient descent algorithm:
Repeat until convergence {
\( \theta_{j}:=\theta_{j}-\alpha\frac{\partial}{\partial \theta_{j}}J(\theta_{0},\theta_{1})\quad(\text{for } j=0 \text{ and } j=1)\)
}
(\(:=\) denotes assignment; \(\alpha\) is the learning rate.)
Warning: \(\theta_{0}\) and \(\theta_{1}\) should be updated simultaneously!
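The point of the warning is that both partial derivatives must be evaluated at the old \((\theta_{0},\theta_{1})\) before either parameter is overwritten. A sketch of the pattern, using a made-up quadratic cost purely to have concrete partial derivatives (not the linear regression cost):

```python
# Hypothetical cost J(t0, t1) = t0**2 + t1**2, chosen only to illustrate
# the update pattern; its partial derivatives are 2*t0 and 2*t1.
def d_theta0(t0, t1):
    return 2 * t0

def d_theta1(t0, t1):
    return 2 * t1

theta0, theta1, alpha = 1.0, 1.0, 0.1

# Correct: evaluate both partial derivatives at the old (theta0, theta1),
# then assign, so neither update sees a half-updated parameter pair.
temp0 = theta0 - alpha * d_theta0(theta0, theta1)
temp1 = theta1 - alpha * d_theta1(theta0, theta1)
theta0, theta1 = temp0, temp1
```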
In particular, for gradient descent applied to linear regression the partial derivative becomes
\(\frac{\partial}{\partial \theta_{j}}J(\theta_{0},\theta_{1})=\frac{1}{m}\sum_{i=1}^{m}\left(h_{\theta}(x^{(i)})-y^{(i)}\right)x_{j}^{(i)}\)
where for \(j=0\) we define \(x_{0}^{(i)}=1\).
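Putting the pieces together, a sketch of batch gradient descent for one-variable linear regression, reusing the same hypothetical `x`, `y` arrays; the learning rate and iteration count are arbitrary choices, not values from the notes:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])   # hypothetical feature values
y = np.array([2.0, 2.5, 3.5, 4.5])   # hypothetical targets
m = len(y)

theta0, theta1 = 0.0, 0.0
alpha = 0.01           # learning rate (arbitrary choice)

for _ in range(1000):  # fixed iteration count instead of a convergence test
    errors = (theta0 + theta1 * x) - y   # h_theta(x^(i)) - y^(i)
    grad0 = np.sum(errors) / m           # times x_0^(i) = 1
    grad1 = np.sum(errors * x) / m       # times x_1^(i)
    # Simultaneous update: both gradients use the old theta values.
    theta0, theta1 = theta0 - alpha * grad0, theta1 - alpha * grad1

print(theta0, theta1)
```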