Linear Regression
[!def] Linear Regression Formula
$$
g(x) = \alpha x + \beta
$$
$$
\alpha = \frac{n \sum_i x_i y_i - \sum_i x_i \sum_i y_i}{n \sum_i x_i^2 - \left(\sum_i x_i\right)^2}
$$
$$
\beta = \frac{1}{n} \sum_i y_i - \alpha \frac{1}{n} \sum_i x_i
$$
$$
\text{Loss} = \sum_i \left(g(x_i) - y_i\right)^2
$$
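As a concrete illustration, here is a minimal NumPy sketch (my own addition, not part of the definition above) that fits $\alpha$ and $\beta$ using the closed-form formulas:

```python
import numpy as np

def fit_linear_regression(x: np.ndarray, y: np.ndarray) -> tuple[float, float]:
    """Fit g(x) = alpha * x + beta via the closed-form least-squares formulas."""
    n = len(x)
    # alpha = (n * sum(x*y) - sum(x) * sum(y)) / (n * sum(x^2) - (sum(x))^2)
    alpha = (n * np.sum(x * y) - np.sum(x) * np.sum(y)) / (
        n * np.sum(x**2) - np.sum(x) ** 2
    )
    # beta = mean(y) - alpha * mean(x)
    beta = np.mean(y) - alpha * np.mean(x)
    return float(alpha), float(beta)

# Usage: noisy samples drawn around y = 2x + 1
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y = 2 * x + 1 + rng.normal(scale=0.5, size=x.shape)
alpha, beta = fit_linear_regression(x, y)
print(f"alpha = {alpha:.2f}, beta = {beta:.2f}")  # close to 2 and 1
```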
- It is called *Linear* Regression because the model is a linear combination of the parameters ($\alpha$, $\beta$) - it need not be linear in $x$
- $\alpha$ and $\beta$ are computed by setting the derivative of the loss with respect to each parameter to 0
- The closed-form solution above comes directly from that step (see the derivation sketch after this list)
- Advantages:
    - Simple
    - Tractable
    - Extensible - can be extended with polynomial features to capture non-linearity (see the sketch after this list)
    - Interpretable
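The derivation referenced above, as a sketch: differentiate the squared loss with respect to each parameter and set the result to 0.
$$
\frac{\partial \text{Loss}}{\partial \beta} = 2 \sum_i (\alpha x_i + \beta - y_i) = 0 \implies \beta = \frac{1}{n} \sum_i y_i - \alpha \frac{1}{n} \sum_i x_i
$$
$$
\frac{\partial \text{Loss}}{\partial \alpha} = 2 \sum_i x_i (\alpha x_i + \beta - y_i) = 0
$$
Substituting $\beta$ into the second equation and solving for $\alpha$ gives the closed-form slope above.

For the extensibility point, a minimal sketch (the data here is made up for illustration) of a polynomial extension - the model is non-linear in $x$ but still linear in its coefficients, so the same least-squares machinery applies:

```python
import numpy as np

# Degree-3 polynomial regression: non-linear in x, but linear in the
# coefficients, so it still has a closed-form least-squares solution.
rng = np.random.default_rng(1)
x = np.linspace(-3, 3, 100)
y = x**3 - 2 * x + rng.normal(scale=1.0, size=x.shape)
coeffs = np.polyfit(x, y, deg=3)  # closed-form least-squares fit
print(coeffs)  # roughly [1, 0, -2, 0]
```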
[!question] What are the assumptions of Linear Regression?
- Multivariate Normality - the residuals are normally distributed
- No autocorrelation - the residuals are independent of one another
- Linear relationship between the independent and dependent variables
- No or little Multicollinearity among the independent variables
- Homoscedasticity - the residuals have constant variance across all values of $x$
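A rough numerical sketch (the function name and the split-variance heuristic are my own, not canonical statistical tests) of how two of these assumptions can be spot-checked:

```python
import numpy as np

def spot_check_assumptions(X: np.ndarray, residuals: np.ndarray, fitted: np.ndarray) -> None:
    """Crude checks; X is (n_samples, n_features) with >= 2 features."""
    # Multicollinearity: large off-diagonal feature correlations are a warning sign.
    corr = np.corrcoef(X, rowvar=False)
    off_diag = corr[~np.eye(corr.shape[0], dtype=bool)]
    print("max |feature correlation|:", np.max(np.abs(off_diag)))

    # Homoscedasticity: residual spread should not change with the fitted values.
    order = np.argsort(fitted)
    half = len(order) // 2
    low = np.var(residuals[order[:half]])
    high = np.var(residuals[order[half:]])
    print("residual variance ratio (high/low fitted):", high / low)
```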