> [!tldr] Mutual Information
> Mutual information (MI) measures the strength of association between two variables $X, Y$. It is always non-negative: it equals $0$ exactly when the variables are independent, and it grows without bound as the dependence strengthens, e.g. if $Y$ is a deterministic function of $X$, then $I(X;Y) = H(Y)$, the entropy of $Y$.
>
> It is defined as
> $I(X;Y):= \mathrm{KL}(P_{X,Y} ~\|~ P_{X}P_{Y}),$
> where $\mathrm{KL}$ is the [[Kullback–Leibler Divergence]], $P_{X,Y}$ is the joint distribution, $P_{X}, P_{Y}$ are the marginal distributions, and $P_{X}P_{Y}$ is the product measure that pretends the two are independent, i.e. $(P_{X}P_{Y})(x,y):= P_{X}(x)P_{Y}(y)$.
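
As a quick sanity check of the definition, here is a minimal sketch (the $2\times 2$ joint table is made up for illustration) that computes $I(X;Y)$ directly as $\mathrm{KL}(P_{X,Y} \,\|\, P_{X}P_{Y})$:

```python
import numpy as np

# Hypothetical joint distribution P_{X,Y} (rows: x, cols: y).
p_xy = np.array([[0.4, 0.1],
                 [0.1, 0.4]])

p_x = p_xy.sum(axis=1, keepdims=True)  # marginal P_X (column vector)
p_y = p_xy.sum(axis=0, keepdims=True)  # marginal P_Y (row vector)

# I(X;Y) = KL(P_{X,Y} || P_X P_Y) = sum_{x,y} p(x,y) log( p(x,y) / (p(x) p(y)) )
mask = p_xy > 0  # convention: 0 * log 0 = 0
mi = np.sum(p_xy[mask] * np.log(p_xy[mask] / (p_x * p_y)[mask]))
print(mi)  # ~0.193 nats; 0 would mean independence
```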
[Paper estimating MI using k-nearest neighbors](https://arxiv.org/pdf/cond-mat/0305641)
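
For continuous data the densities are unknown, so MI must be estimated from samples. A minimal sketch using scikit-learn's `mutual_info_regression`, which implements a nearest-neighbor estimator in the spirit of the paper above (the toy data below is made up for illustration):

```python
import numpy as np
from sklearn.feature_selection import mutual_info_regression

rng = np.random.default_rng(0)
x = rng.normal(size=1000)
y = x**2 + 0.1 * rng.normal(size=1000)  # nonlinear dependence

# k-nearest-neighbor MI estimate; note X must be 2-D (n_samples, n_features).
mi = mutual_info_regression(x.reshape(-1, 1), y, n_neighbors=3)
print(mi)  # well above 0, even though the linear correlation of x and y is ~0
```

This example also illustrates why MI is useful: $y = x^2$ has (near-)zero linear correlation with $x$, yet the estimated MI is clearly positive.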