> [!tldr]
> Rank-based tests are **distribution-free** hypothesis tests based on rank statistics -- they work without assuming any particular underlying distribution.
Suppose we have a sample $\mathbf{Z}=(Z_{1},\dots,Z_{N})$ and wish to test for a location parameter $\Delta$, but we are unsure of the population distribution.
- Tests that assume a certain distribution (e.g. **z-tests** and **t-tests** that assume normality) will give incorrect p-values if the assumption is badly off.
- Hence a statistic that is **distribution-free** is needed.
## Ranks and Exchangeability
> [!tldr]
> If the sample $\mathbf{Z}$ is **exchangeable**, then its **ranks** (and statistics that depend only on the ranks) are distribution-free, uniform, and immune to strictly increasing transformations.
> [!warning] For simplicity, we assume there are no ties in the data.
> [!definition|*] Ranks
> The **ranks** $R(\mathbf{Z})=(R_{1},\dots,R_{N})$ are a statistic of a sample $\mathbf{Z}$. The rank of the observation $Z_{i}$ is the number of observations no larger than it: $R_{i}:= \sum_{j=1}^{N}\mathbf{1}_{Z_{j} \le Z_{i}}.$ Equivalently, $R_{i}$ is the ($1$-based) index $j$ such that the order statistic $Z_{(j)}=Z_{i}$.
One useful way of treating ranks is as a function $R$ mapping each index $i$ to the rank of $Z_{i}$ (where the observation goes after sorting); if there are no ties, it has an inverse map $D$, which maps each rank back to the index of the observation holding that rank (where the observation was before sorting).
- For example, if $\mathbf{Z}=(1,10,4,2)$, then $R=(1,4,3,2)$, and $D=(1,4,3,2)$ (they coincide here only because this particular permutation is its own inverse).
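As a quick sketch of how $R$ and $D$ can be computed (assuming `numpy`; the variable names are ours), `argsort` gives $D$ directly, and inverting it gives $R$:

```python
import numpy as np

Z = np.array([1, 10, 4, 2])

# D: rank -> original index; argsort returns where each sorted value came from.
D = np.argsort(Z)                  # 0-based: [0, 3, 2, 1]

# R: index -> rank; invert D by scattering the ranks 1..N back to the
# original positions.
R = np.empty_like(D)
R[D] = np.arange(1, len(Z) + 1)    # 1-based ranks

print(R)      # [1 4 3 2], matching the example above
print(D + 1)  # [1 4 3 2] (1-based), the inverse permutation
```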
We wish to say that ranks are distribution-free and *all orders are equally likely*. That is, $R(\mathbf{Z}) \sim U[\mathcal{P}_{N}]$, where $\mathcal{P}_{N}$ is the set of all possible permutations of $\{ 1,\dots,N \}$.
- That is obviously not true in general, e.g. when the samples are sorted before the analysis.
- IID samples have distribution-free ranks, but we can in fact relax the requirement:
> [!definition|*] Exchangeability
> A sample $\mathbf{Z}=(Z_{1},\dots,Z_{N})$ is **exchangeable** if permuting the entries does not change the joint distribution: $(Z_{1},\dots,Z_{N}) \sim (Z_{\sigma(1)},\dots,Z_{\sigma(N)})$ for any permutation $\sigma$ on $\{ 1,\dots,N \}$. For continuous variables with joint density $f_{1:N}$, this is equivalent to $f_{1:N}(z_{1},\dots,z_{N})=f_{1:N}(z_{\sigma(1)},\dots,z_{\sigma(N)})$ for all $(z_{1},\dots,z_{N})$.
- IID implies exchangeability, but the converse is not true: for example, jointly normal $X_{1},X_{2} \sim N(0,1)$ with correlation $\rho \ne 0$ are exchangeable but not independent.
- One necessary condition for exchangeability is that *all $N$ marginal distributions of $Z_{1},\dots,Z_{N}$ must be the same*: when integrating out all $Z_{k \ne i}$ to compute the marginal of $Z_{i}$, exchanging $Z_{i}$ with $Z_{j}$ shows that it equals the marginal of $Z_{j}$.
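As a sanity check on the normal example above (a minimal simulation sketch, assuming `numpy`), exchangeability shows up as symmetry between the two coordinates even though they are dependent:

```python
import numpy as np

rng = np.random.default_rng(0)
rho = 0.7
XY = rng.multivariate_normal([0.0, 0.0], [[1.0, rho], [rho, 1.0]], size=100_000)

# Exchangeable but not independent: both orderings are equally likely,
# so each of the two possible rank vectors has probability ~ 1/2.
print((XY[:, 0] < XY[:, 1]).mean())  # ~ 0.5
```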
> [!theorem|*] Uniformity of Ranks in Exchangeable Samples
> If the observations $\mathbf{Z}$ are exchangeable, then their ranks are uniformly distributed: $R(\mathbf{Z}) \sim U[\mathcal{P}_{N}]$.
>
> > [!proof]-
> > Consider any $r \in \mathcal{P}_{N}$ and its sorting indices $d:= D(r)$, the inverse permutation of $r$. Then $\{ R=r \}=\{ Z_{d_{1}}<Z_{d_{2}}<\dots<Z_{d_{N}} \}$, which has probability
> > $\begin{align*}
> > \mathbb{P}[Z_{d_{1}}<\dots<Z_{d_{N}}]&= \int \mathbf{1}\{ z_{d_{1}}<\dots<z_{d_{N}} \}\cdot f_{\mathbf{Z}}(z_{1},\dots,z_{N}) ~ d\mathbf{z}\\
> > &= \int \mathbf{1}\{ s_{1}<\dots<s_{N} \}\cdot f_{\mathbf{Z}}(s_{r_{1}},\dots, s_{r_{N}}) ~ d\mathbf{s}\\
> > &\hphantom{\mathbf{1}\{ s_{1}<\dots<s_{N} \}}[s_{i}:=z_{d_{i}},\text{ so }s_{r_{i}}=z_{i}]\\
> > &= \int \mathbf{1}\{ s_{1}<\dots<s_{N} \}\cdot f_{\mathbf{Z}}(s_{1},\dots, s_{N}) ~ d\mathbf{s}\\
> > &\hphantom{\mathbf{1}\{ s_{1}<\dots<s_{N} \}}[\text{exchangeability of }f_{\mathbf{Z}}]\\[0.4em]
> > &= \mathbb{P}[Z_{1}<\dots<Z_{N}].
> > \end{align*}$
> > Therefore, all $N!$ rank vectors are equally probable (each with probability $1/N!$), and $R \sim U[\mathcal{P}_{N}]$.
As a result, the rank statistic $R(\mathbf{Z})$ has the following desirable properties:
- It is **distribution-free** or **ancillary** over the family $\mathcal{D}$ of exchangeable distributions, meaning that $R(\mathbf{Z})$ has the same distribution for any $D \in \mathcal{D}$ with $\mathbf{Z} \sim D$.
- It is invariant under strictly increasing transformations of the data (as long as they do not produce ties); so are test statistics $T(R(\mathbf{Z}))$ that depend only on the ranks. Both properties are illustrated in the sketch after this list.
- It is [[Robustness|robust]] against outliers, similar to the median.
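The following sketch (assuming `numpy` and `scipy`) illustrates the first two properties: ranks are unchanged by strictly increasing maps, and for an iid sample every rank vector is equally likely:

```python
import numpy as np
from collections import Counter
from scipy.stats import rankdata

rng = np.random.default_rng(0)

# Invariance: strictly increasing transformations preserve the ranks.
Z = rng.standard_normal(6)
assert np.array_equal(rankdata(Z), rankdata(np.exp(Z)))
assert np.array_equal(rankdata(Z), rankdata(2 * Z + 7))

# Distribution-freeness: for iid (hence exchangeable) samples of size 3,
# all 3! = 6 rank vectors occur with frequency ~ 1/6, regardless of the
# population (here exponential, far from normal).
counts = Counter(tuple(rankdata(rng.exponential(size=3))) for _ in range(60_000))
print({r: round(c / 60_000, 3) for r, c in counts.items()})
```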
> [!theorem|*] Independence of Ranks and Order Statistics in Exchangeable Samples
> Furthermore, if $\mathbf{Z}$ is exchangeable, then its *ranks $R(\mathbf{Z})$ and order statistics $(Z_{(1)},\dots,Z_{(N)})$ are independent.*
> > [!proof]
> > Write $\phi(\mathbf{Z}):= (Z_{(1)},\dots,Z_{(N)},R(\mathbf{Z}))$, and let $\sigma$ be any permutation on $\{ 1,\dots ,N \}$.
> >
> > Since $\mathbf{Z}$ and $\sigma \mathbf{Z}$ are identically distributed, so are $\phi(\mathbf{Z})$ and $\phi(\sigma \mathbf{Z})$ (note that $\sigma \mathbf{Z}$ has the same order statistics as $\mathbf{Z}$). In particular, conditioning on the rank $\{ R=r \}$, $\begin{align*}
> > (Z_{(1)},&\dots,Z_{(N)})~|~\{ R(\mathbf{Z})=r \} \\
> > &~~\sim~~(Z_{(1)},\dots,Z_{(N)}) ~|~ \{ R(\sigma \mathbf{Z}) =r\}.
> > \end{align*}$We want to show that $\mathrm{LHS}$ has the same distribution regardless of the choice of $r$.
> >
> > Now choose $\sigma:=r$, i.e. mapping $i$ to $r_{i}$, so $r\mathbf{Z}=(Z_{r_{1}},\dots,Z_{r_{N}})$. Since $R(r\mathbf{Z})_{i}=R(\mathbf{Z})_{r_{i}}$ and $r$ is a bijection, the conditioning event is $\{ R(r\mathbf{Z}) =r \}=\{ R(\mathbf{Z})_{r_{i}}=r_{i}\,\,\forall i \}=\{ R(\mathbf{Z})=(1,\dots,N) \},$so $\mathrm{RHS}$ becomes $(Z_{(1)},\dots,Z_{(N)})~|~\{ R(\mathbf{Z}) =(1,\dots,N)\}$, which does not depend on $r$. Therefore, $\mathrm{LHS}$ has the same distribution for every $r$, and so $(Z_{(1)},\dots,Z_{(N)})$ is independent of $R(\mathbf{Z})$.
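A quick Monte Carlo check of the theorem (a sketch, assuming `numpy` and `scipy`): grouping samples by their rank vector, the conditional distribution of the order statistics looks the same in every group:

```python
import numpy as np
from collections import defaultdict
from scipy.stats import rankdata

rng = np.random.default_rng(1)
groups = defaultdict(list)
for _ in range(60_000):
    Z = rng.standard_normal(3)
    groups[tuple(rankdata(Z))].append(np.sort(Z))  # order statistics

# Conditional means of (Z_(1), Z_(2), Z_(3)) given each rank vector:
# all six rows should agree, up to Monte Carlo error.
for r, v in sorted(groups.items()):
    print(r, np.mean(v, axis=0).round(2))
```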
## General Setup of Rank-Based Tests
Assuming exchangeability of $\mathbf{Z}$, so that $R(\mathbf{Z}) \sim U[\mathcal{P}_{N}]$, any statistic $T(R(\mathbf{Z}))$ that is a function of the ranks is also ancillary over $\mathcal{D}$ -- it has the distribution of $T(S)$, where $S \sim U[\mathcal{P}_{N}]$.
Therefore, a rank-based test with test statistic $T(R(\mathbf{Z}))$ computes the observed value $T(R(\mathbf{z}))$ and *tests its compatibility with some null hypothesis $H_{0}$ under which the data is exchangeable*.
For example, if the test statistic takes extremely large values when $H_{0}$ is false, the rank-based test has **p-value** $p:= \mathbb{P}[T(R(\mathbf{Z})) \ge T(R(\mathbf{z}))]=\frac{1}{N!}\sum_{s \in \mathcal{P}_{N}}\mathbf{1}\{T(s) \ge T(R(\mathbf{z}))\}.$
Alternatively, the test can use appropriate **quantiles** as cut-offs: e.g. for a one-sided test of (nominal) level $\alpha$, find the $\alpha$-quantile $w_{\alpha}$ as the cut-off of the critical region $(-\infty, w_{\alpha}]$.
- The pitfalls of the discrete quantiles are noted in [[#Nominal vs. Actual Levels]].
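A generic implementation of the exact p-value above (a sketch; `exact_rank_pvalue` is our own hypothetical helper, feasible only for small $N$ since it enumerates all $N!$ permutations):

```python
import math
import numpy as np
from itertools import permutations

def exact_rank_pvalue(T, r_obs):
    """Exact p-value P[T(S) >= T(r_obs)] for S ~ U[P_N], by full enumeration."""
    N = len(r_obs)
    t_obs = T(np.asarray(r_obs))
    hits = sum(T(np.array(s)) >= t_obs for s in permutations(range(1, N + 1)))
    return hits / math.factorial(N)

# Example: T = rank sum of the last m observations (the Wilcoxon statistic
# of the next section), with N = 5 and m = 2.
m = 2
T = lambda r: r[-m:].sum()
print(exact_rank_pvalue(T, np.array([1, 4, 3, 2, 5])))
```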
## Two-Sample Permutation Tests
Suppose we want to decide whether two samples $\mathbf{X}=(X_{1},\dots,X_{n}) \overset{\mathrm{iid.}}{\sim} F_{X}$ and $\mathbf{Y} = (Y_{1},\dots,Y_{m}) \overset{\mathrm{iid.}}{\sim} F_{Y}$ have the same distribution.
In particular, we test whether there is a **location shift** $\Delta$ such that $X_{i},\, (Y_{j} - \Delta) \overset{\mathrm{iid.}}{\sim}F$ for $i=1,\dots,n$ and $j = 1,\dots,m$. The hypotheses are just $\begin{align*}
H_{0}: \Delta &= 0,\\
H_{1}:\Delta &\ne 0 \text{ (two-sided)},\text{ or } \Delta >0,\, \Delta<0 \text{ (one-sided)}.
\end{align*}$
To do so, we compute rank-based statistics on the concatenated sample $\mathbf{Z} = (X_{1},\dots,X_{n}, Y_{1},\dots,Y_{m})$ of size $N:= n+m$. *Under $H_{0}$, the sample is iid., hence exchangeable.*
### Wilcoxon Rank Sum Test
The WRST tests the hypotheses with the **Wilcoxon rank sum test statistic**, i.e. the sum of the ranks of the $Y_{j}$'s in $\mathbf{Z}$: $W(\mathbf{R}):= \sum_{j=n+1}^{N}R_{j}.$
Under $H_{0}:\Delta=0$, exchangeability gives $R(\mathbf{Z}) \sim U[\mathcal{P}_{N}]$. In contrast, *a large $W(\mathbf{R})$ indicates a positive shift $\Delta>0$, and a small $W$ a negative shift*.
The WRST then uses the test statistic $W(\mathbf{R}^\mathrm{obs})$ to construct a test with:
- *p-values*: e.g. two-tailed test has p-value being $p:=\mathbb{P}\Big[| W(\mathbf{R}) - \mu_{m,n} | \ge | W(\mathbf{R}^{\mathrm{obs}})- \mu_{m,n} |\Big],$where $\mu_{m,n}= m(m + n + 1) /2=\mathbb{E}[W(\mathbf{R}) \,|\, H_{0}]$.
- *quantiles*: e.g. to have a one-sided test of (nominal) level $\alpha$, find the $\alpha$-quantile $w_{\alpha}$ of $W(\mathbf{R}) ~|~ H_{0}$.
A very similar concept is the **Mann-Whitney test statistic**, defined by $\mathrm{MW}(\mathbf{X}, \mathbf{Y}):= \sum_{i,j}\mathbf{1}_{Y_{j} > X_{i}},$ and for the same dataset, $\mathrm{MW}=W - m(m+1) / 2$, so they give the same test.
- Using this formulation, it is easy to see that under $H_{0}$, $\mathbb{E}[\mathrm{MW}]=mn / 2$, and $\mathbb{E}[W] = m(m+n+1) / 2$.
> [!proof]- Proof of $\mathrm{MW}=W - m(m+1) / 2$
> Reorganize the sum in $\mathrm{MW}$ to be over each $Y_{j}$: the rank of $Y_{j}$ in $\mathbf{Z}$ counts the $X_{i}$'s no larger than it plus its rank among the $Y_{k}$'s. Summing over $j$, the within-sample ranks are $1,\dots,m$, so $W = \mathrm{MW} + (1 + 2 + \dots + m) = \mathrm{MW} + m(m+1) / 2$.
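As a numerical sanity check of this identity (a minimal sketch assuming `numpy` and `scipy`; with no ties, `scipy.stats.mannwhitneyu(Y, X)` reports exactly this count):

```python
import numpy as np
from scipy.stats import rankdata, mannwhitneyu

rng = np.random.default_rng(2)
n, m = 8, 5
X = rng.standard_normal(n)
Y = rng.standard_normal(m) + 1.0          # shifted by Delta = 1

ranks = rankdata(np.concatenate([X, Y]))  # ranks in the pooled sample Z
W = ranks[n:].sum()                       # Wilcoxon rank sum of the Y's
MW = sum(y > x for x in X for y in Y)     # Mann-Whitney count

assert MW == W - m * (m + 1) / 2          # the identity proved above
assert MW == mannwhitneyu(Y, X).statistic # scipy agrees (no ties here)
```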