Contingency Table - Random Notes Go Brrrrrrr

> [!tldr] Contingency Table
> Given a dataset $\mathbf{x}=(x^{(1)},\dots,x^{(N)})$ of realizations of a discrete random variable $X$ taking values in $\mathcal{X}$, the **contingency table** is a function $n:\mathcal{X} \to \mathbb{N}$ that counts how many times each value $x$ occurs in the dataset $\mathbf{x}$: $n(x;\mathbf{x}):=\sum_{i=1}^{N}\mathbf{1}\{ x^{(i)}=x \}.$
>
> Assuming $\mathbf{x}$ is an iid sample of $X$, we have (for each fixed $x$) $n(x;\mathbf{x})\sim \mathrm{Binom}(N, p(x))$, where $p(x):=\mathbb{P}[X=x]$. Jointly (over all $x\in\mathcal{X}$), the vector of counts satisfies $(n(x;\mathbf{x}))_{x\in\mathcal{X}}\sim \mathrm{Multinom}(N, \pi)$, where $\pi$ is the vector recording the distribution of $X$, with $\pi_{x}=p(x)$.

In cases where $X=(X_{1},X_{2})$, with possible values of $X_{1}$ being $\{x_{11},x_{12},\dots \}$, and similarly for $X_{2}$, the contingency table can be arranged into an actual table, where the $(i,j)$ cell contains $n((x_{1i},x_{2j});\mathbf{x})$.

For example, consider a dataset of (predicted, achieved) class pairs, with possible values $\mathcal{X}=\{ 1,2,3 \}^{2}$: $(1,1),(1,2),(2,1),(2,2),(1,1),(2,1),(3,2),(3,3)$. Its contingency table is

$\begin{array}{c|c} &\text{achieved}\\ \text{predicted}& \begin{matrix} 1 & 2 & 3\end{matrix} \\ \hline\begin{matrix} 1 \\ 2 \\ 3 \end{matrix} & ~~~~\begin{matrix} 2 & 1 & 0 \\ 2 & 1 & 0 \\ 0 & 1 & 1 \end{matrix} \end{array}$
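
The counting in the example can be reproduced with a few lines of Python. This is a minimal sketch using only the standard library; the names `data`, `classes`, `counts`, `table` and `pi_hat` are illustrative choices, not part of the definition. It also computes the relative frequencies $n(x;\mathbf{x})/N$, the usual empirical estimate of $\pi$.

```python
from collections import Counter

# Dataset of (predicted, achieved) pairs from the example.
data = [(1, 1), (1, 2), (2, 1), (2, 2), (1, 1), (2, 1), (3, 2), (3, 3)]
classes = [1, 2, 3]  # possible values of each coordinate

# n(x; data): number of times each value x occurs in the dataset.
# Counter returns 0 for values that never occur.
counts = Counter(data)

# Arrange the counts into a table: rows = predicted, columns = achieved.
table = [[counts[(p, a)] for a in classes] for p in classes]

# Relative frequencies n(x; data) / N, an empirical estimate of pi.
N = len(data)
pi_hat = [[cell / N for cell in row] for row in table]

for row in table:
    print(row)
```

The rows of `table` correspond to predicted classes and the columns to achieved classes, matching the layout of the table in the example.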