Mutual Information

  • Mutual Information (MI) measures how strongly two random variables depend on each other
  • $MI(input_i, output)$ measures how much $input_i$ contributes to the output, so inputs can be ranked against each other
  • More MI means more contribution
  • If the data is continuous, use bins to make it discrete (see the sketch after this list)
  • Range: $[0, \infty)$; MI is $0$ exactly when the variables are independent
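
A minimal sketch of this binning-and-ranking idea in Python, assuming a synthetic toy dataset; `binned_mi`, `n_bins`, and the feature names are illustrative, not from the notes. The discrete MI itself comes from scikit-learn's `mutual_info_score`.

```python
import numpy as np
from sklearn.metrics import mutual_info_score

rng = np.random.default_rng(0)
n = 1000
x_signal = rng.normal(size=n)       # feature that actually drives the output
x_noise = rng.normal(size=n)        # feature unrelated to the output
y = (x_signal > 0).astype(int)      # output depends only on x_signal

def binned_mi(x, y, n_bins=10):
    """Bin a continuous feature, then compute its MI with the output."""
    edges = np.histogram_bin_edges(x, bins=n_bins)
    return mutual_info_score(np.digitize(x, edges), y)

print(binned_mi(x_signal, y))  # high MI: large contribution
print(binned_mi(x_noise, y))   # near 0: little contribution
```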

[!def] Mutual Information = ?
$$
\begin{align*}
MI &= \sum_{x, y} P(x, y) \log \frac{P(x, y)}{P(x) P(y)} \\
&= \sum_x \sum_y P(x, y) \log \frac{P(x, y)}{P(x) P(y)}
\end{align*}
$$
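
A direct sketch of the definition above, estimating $P(x, y)$ from a contingency table of counts; `mutual_information` is an illustrative name. It uses the natural log, so the result is in nats and should agree with `sklearn.metrics.mutual_info_score` on the same arrays.

```python
import numpy as np

def mutual_information(x, y):
    """MI in nats between two discrete arrays, computed from joint counts."""
    xi = np.unique(x, return_inverse=True)[1]  # map values to 0..K-1 indices
    yi = np.unique(y, return_inverse=True)[1]
    joint = np.zeros((xi.max() + 1, yi.max() + 1))
    np.add.at(joint, (xi, yi), 1)              # contingency table of counts
    p_xy = joint / joint.sum()                 # joint distribution P(x, y)
    p_x = p_xy.sum(axis=1, keepdims=True)      # marginal P(x)
    p_y = p_xy.sum(axis=0, keepdims=True)      # marginal P(y)
    mask = p_xy > 0                            # skip 0·log(0) terms (limit is 0)
    return np.sum(p_xy[mask] * np.log((p_xy / (p_x * p_y))[mask]))
```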