Gradient Descent For Multiple Variables
Cost function: $J(\theta) = \dfrac {1}{2m} \displaystyle \sum_{i=1}^m \left (h_\theta (x^{(i)}) - y^{(i)} \right)^2$
$J(\theta) = \dfrac {1}{2m} \displaystyle \sum_{i=1}^m \left (\theta^Tx^{(i)} - y^{(i)} \right)^2$
$J(\theta) = \dfrac {1}{2m} \displaystyle \sum_{i=1}^m \left ( \left( \sum_{j=0}^n \theta_j x_j^{(i)} \right) - y^{(i)} \right)^2$
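As a rough illustration of the cost function above, here is a minimal pure-Python sketch. The function names `hypothesis` and `compute_cost` and the toy data are assumptions made for this example, not part of the original notes; each row of `X` is assumed to start with $x_0 = 1$.

```python
def hypothesis(theta, x):
    """Linear hypothesis h_theta(x) = theta^T x (x[0] is assumed to be 1)."""
    return sum(t * xj for t, xj in zip(theta, x))


def compute_cost(X, y, theta):
    """Squared-error cost J(theta) = 1/(2m) * sum_i (h_theta(x_i) - y_i)^2."""
    m = len(y)
    squared_errors = ((hypothesis(theta, xi) - yi) ** 2 for xi, yi in zip(X, y))
    return sum(squared_errors) / (2 * m)


# Hypothetical toy data: three examples, one feature plus the bias term x_0 = 1.
X = [[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [1.0, 2.0, 3.0]

cost_zero = compute_cost(X, y, [0.0, 0.0])  # (1 + 4 + 9) / (2 * 3)
cost_fit = compute_cost(X, y, [0.0, 1.0])   # perfect fit, so the cost is 0
```

The inner sum in `hypothesis` corresponds to $\sum_{j=0}^n \theta_j x_j^{(i)}$ in the third form of the cost function.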
Gradient descent: $\begin{align*}
& \text{repeat until convergence:} \; \lbrace \newline
\; & \theta_j := \theta_j - \alpha \frac{1}{m} \sum\limits_{i=1}^{m} (h_\theta(x^{(i)}) - y^{(i)}) \cdot x_j^{(i)} \; & \text{for j := 0..n}
\newline \rbrace
\end{align*}$
which breaks down into
$\begin{align*}
& \text{repeat until convergence:} \; \lbrace \newline
\; & \theta_0 := \theta_0 - \alpha \frac{1}{m} \sum\limits_{i=1}^{m} (h_\theta(x^{(i)}) - y^{(i)}) \cdot x_0^{(i)}\newline
\; & \theta_1 := \theta_1 - \alpha \frac{1}{m} \sum\limits_{i=1}^{m} (h_\theta(x^{(i)}) - y^{(i)}) \cdot x_1^{(i)} \newline
\; & \theta_2 := \theta_2 - \alpha \frac{1}{m} \sum\limits_{i=1}^{m} (h_\theta(x^{(i)}) - y^{(i)}) \cdot x_2^{(i)} \newline
& \cdots
\newline \rbrace
\end{align*}$
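The update rule can be sketched in pure Python as follows. This is only an illustrative sketch under stated assumptions: the name `gradient_descent`, the toy data, the learning rate, and the use of a fixed iteration count in place of a real convergence test are all choices made for this example. Note that every $\theta_j$ is computed from the old parameter vector before any of them is overwritten, so the update is simultaneous.

```python
def gradient_descent(X, y, theta, alpha, num_iters):
    """Batch gradient descent for linear regression with n features.

    Each row of X is assumed to start with x_0 = 1. A fixed number of
    iterations stands in for "repeat until convergence" (an assumption).
    """
    m = len(y)
    theta = list(theta)  # copy so the caller's list is not modified
    for _ in range(num_iters):
        # Errors h_theta(x_i) - y_i, all computed with the current theta.
        errors = [
            sum(t * xj for t, xj in zip(theta, xi)) - yi
            for xi, yi in zip(X, y)
        ]
        # Partial derivative for each j: (1/m) * sum_i error_i * x_j^(i).
        gradient = [
            sum(e * xi[j] for e, xi in zip(errors, X)) / m
            for j in range(len(theta))
        ]
        # Simultaneous update of every theta_j.
        theta = [t - alpha * g for t, g in zip(theta, gradient)]
    return theta


# Hypothetical toy data generated from y = 1 + 2 * x_1.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [1.0, 3.0, 5.0, 7.0]

one_step = gradient_descent(X, y, [0.0, 0.0], alpha=0.1, num_iters=1)
theta = gradient_descent(X, y, [0.0, 0.0], alpha=0.1, num_iters=5000)
```

On this toy data the parameters approach $\theta_0 = 1$ and $\theta_1 = 2$. In practice, a stopping condition based on the change in $J(\theta)$ between iterations is a common alternative to a fixed iteration count.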