Linear Regression

[!def] Linear Regression Formula
$$
g(x) = \alpha x + \beta
$$
$$
\alpha = \frac{n \sum_i x_i y_i - \sum_i x_i \sum_i y_i}{n \sum_i x_i^2 - \left(\sum_i x_i\right)^2}
$$
$$
\beta = \frac{1}{n} \sum_i y_i - \alpha \frac{1}{n} \sum_i x_i
$$
$$
\text{Loss} = \sum_i (g(x_i) - y_i)^2
$$

  • It is called Linear Regression because the model is a linear combination of the parameters ($\alpha$, $\beta$) — it need not be linear in the inputs
  • $\alpha$ and $\beta$ are computed by setting the derivative of the loss with respect to each parameter to 0
    • This is exactly where the closed-form solution above comes from
  • Advantages:
    • Simple
    • Tractable
    • Extensible - can be extended to polynomial for non-linearity
    • Interpretable
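The closed-form solution above can be sketched directly in NumPy; the data here is a small illustrative toy set, not from the source.

```python
import numpy as np

# Toy data: y is roughly linear in x with some noise (illustrative values)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])
n = len(x)

# Closed-form least-squares estimates, from setting dLoss/dalpha = dLoss/dbeta = 0
alpha = (n * np.sum(x * y) - np.sum(x) * np.sum(y)) / (n * np.sum(x**2) - np.sum(x) ** 2)
beta = np.mean(y) - alpha * np.mean(x)

# Squared-error loss of the fitted line
loss = np.sum((alpha * x + beta - y) ** 2)

print(alpha, beta, loss)
```

The result should match a library fit such as `np.polyfit(x, y, 1)`, which solves the same least-squares problem.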

[!question] In Linear Regression, what are the assumptions?

  1. Multivariate Normality (the residuals are normally distributed)
  2. No autocorrelation (the residuals are independent of one another)
  3. Linear relationship between the features and the target
  4. No or little Multicollinearity among the features
  5. Homoscedasticity (the residuals have constant variance)
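Some of these assumptions can be checked numerically. Below is a minimal sketch on hypothetical synthetic data: pairwise feature correlation as a rough multicollinearity check, and the Durbin–Watson statistic (values near 2 suggest no residual autocorrelation). All data and thresholds here are illustrative assumptions.

```python
import numpy as np

# Hypothetical data: two independent features, linear target with noise
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = 1.5 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.5, size=100)

# Multicollinearity check: pairwise feature correlation (near +/-1 is a red flag)
feature_corr = np.corrcoef(X, rowvar=False)[0, 1]

# Fit by least squares (intercept column appended), then inspect residuals
A = np.column_stack([X, np.ones(len(X))])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
residuals = y - A @ coef

# Autocorrelation check: Durbin-Watson statistic, ~2 means no autocorrelation
dw = np.sum(np.diff(residuals) ** 2) / np.sum(residuals**2)

print(feature_corr, dw)
```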