Suppose we want to sample $X \sim p$, but the distribution $p$ is difficult to sample from directly, while another variable $Y \sim q$ is simple to sample.
Instead of transforming $Y$ to behave like $X$ (as in transformation sampling), rejection sampling *"rejects" values of $Y$ so that the remainder is representative of a sample from the target $X \sim p$.*
- $Y \sim q$ is then called the **proposal (distribution)**, while $X \sim p$ is the **target (distribution)**.
- To make the proposal look like the target, its distribution must be reweighted: at $Y=y$, by a factor of $p(y) / q(y)$.
This is done by *randomly accepting the proposal with some probability* $L \propto p(y) / q(y)$. However, for $L$ to be a valid probability, it must lie in $[0,1]$. Hence find a constant $M$ with $p / q \le M$ everywhere, then define the acceptance threshold $L(y):=\frac{p(y)}{q(y)} \frac{1}{M} \in [0,1]$.
- $M$ is a constant bound, so rejection sampling needs a proposal $q$ such that $p / q$ is bounded; a worked example follows this list.
- To accept the proposal with this probability, we then simulate $U \sim U[0,1]$ and accept if and only if $U \le L$.
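For example, to sample the standard normal target $p(x)=\frac{1}{\sqrt{ 2\pi }}e^{-x^{2}/2}$, one can take a Laplace proposal $q(x)=\frac{1}{2}e^{-|x|}$: since $|x|-x^{2}/2$ is maximized at $|x|=1$, $\frac{p(x)}{q(x)}=\sqrt{ \frac{2}{\pi} }\,e^{|x|-x^{2}/2}\le \sqrt{ \frac{2e}{\pi} }\approx 1.32=:M.$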
> [!algorithm] Rejection Sampling
> The **rejection sampling** algorithm simulates $X \sim p$ by:
>
> $[1]$ Sample $Y \sim q$, giving $Y=y$.
>
> $[2]$ Sample $U \sim U [0,1]$.
>
> $[3]$ If $U \le \frac{p(y)}{Mq(y)}$, accept and return $X=y$. Otherwise reject and return to $[1]$.
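A minimal sketch of this algorithm in Python (the interface — densities `p`, `q`, a proposal sampler `sample_q`, and the bound `M` as arguments — is an illustrative choice, not part of the notes):

```python
import numpy as np

def rejection_sample(p, q, sample_q, M, size=1):
    """Simulate `size` draws from the target density p by rejection sampling."""
    out = []
    while len(out) < size:
        y = sample_q()                # [1] propose Y = y with Y ~ q
        u = np.random.uniform()       # [2] U ~ U[0, 1]
        if u <= p(y) / (M * q(y)):    # [3] accept with probability p(y) / (M q(y))
            out.append(y)             #     and return X = y; otherwise retry
    return np.array(out)
```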
In each iteration, the proposal is accepted with probability $\frac{1}{M}$.
- Hence the number of iterations needed to produce one sample follows $\mathrm{Geom}\left( \frac{1}{M} \right)$, with an expected $M$ iterations per output.
- If $M$ is too large (i.e. $p / q$ is too large at some points), rejection sampling is very inefficient.
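To see the $\mathrm{Geom}\left( \frac{1}{M} \right)$ cost concretely, a usage sketch continuing the normal-from-Laplace example above:

```python
p = lambda x: np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)   # standard normal target
q = lambda x: 0.5 * np.exp(-np.abs(x))                 # Laplace proposal
M = np.sqrt(2 * np.e / np.pi)                          # bound from the worked example

xs = rejection_sample(p, q, lambda: np.random.laplace(), M, size=10_000)
# Each output costs Geom(1/M) iterations: about M ≈ 1.32 proposals on average,
# i.e. an acceptance rate of roughly 1/M ≈ 76%.
```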
> [!proof]- Proof of Acceptance Probability
> Assuming $q \ne 0$ wherever $p \ne 0$, $\mathbb{P}[\text{accept}]=\int q(y)\,\mathbb{P}[\text{accept}\,|\,Y=y] \, dy=\int q(y)\frac{p(y)}{Mq(y)} \, dy=\frac{1}{M},$ since $p$ integrates to $1$.
The rejection algorithm returns $X \sim p$.
> [!proof]- Proof for Discrete Variables
> At each iteration, the probability that $Y=x$ is proposed and accepted is $\begin{align*}
> \mathbb{P}[Y=x,\,\text{accepted}]&= \mathbb{P}[Y=x]\cdot\mathbb{P}[\text{accepted}\,|\,Y=x]\\
> &= q(x)\cdot \mathbb{P}\left[ U \le \frac{p(x)}{Mq(x)} \right]\\
> &= q(x) \cdot \frac{p(x)}{Mq(x)}\\
> &= p(x) / M.
> \end{align*}$ Summing over the iteration $i$ at which the first acceptance occurs, $\begin{align*}
> \mathbb{P}[X=x]&= \sum_{i}\mathbb{P}[Y=x,\, \text{accepted}]\cdot \mathbb{P}[\text{rejected $i-1$ times}]\\
> &= \sum_{i} \frac{p(x)}{M} \cdot \left( 1-\frac{1}{M} \right)^{i-1}\\
> &= p(x).
> \end{align*}$ Hence $X \sim p$.
> [!proof]- Proof for Continuous Variables
> Let $P,Q$ be the respective cdfs of the target and proposal.
> At each iteration, $\begin{align*}\mathbb{P}[Y \le x,\,\text{accepted}]
> &= \mathbb{P}[Y \le x]\cdot\mathbb{P}[\text{accepted}\,|\,Y \le x]\\
> &= Q(x)\cdot \mathbb{P}\left[ U \le \frac{p(Y)}{Mq(Y)}\,\Big|\,Y \le x \right]\\
> &= Q(x) \cdot \int_{-\infty}^{x} \frac{p(y)}{Mq(y)} \underbrace{\frac{q(y)}{Q(x)}}_{q_{Y|Y \le x}} \, dy \\
> &= P(x) / M,
> \end{align*}$ where $q(y) / Q(x)$ is the density of $Y$ conditioned on $Y \le x$. Again, summing over the iteration at which the first acceptance occurs, $F_{X}(x)=\sum_{i}\mathbb{P}[Y \le x,\,\text{accepted}]\cdot \mathbb{P}[\text{rejected $i-1$ times}]=P(x).$ Hence $X$ has cdf $P$, i.e. $X \sim p$.
### Rejection Sampling Without Normalization
*In many cases, we can only evaluate $p$ and $q$ up to proportionality: we know $\tilde{p}=p \cdot c_{p}$ and $\tilde{q}=q \cdot c_{q}$, but not the normalizing constants $c_{p}$ and $c_{q}$.*
To replace $M$, find $\tilde{M}$ with $\frac{\tilde{p}}{\tilde{q}}\le \tilde{M}$ everywhere, and accept with probability $\frac{\tilde{p}}{\tilde{M} \tilde{q}}$ (a code sketch follows this list).
- In this case, the success probability per iteration is not $\frac{1}{\tilde{M}}$ but $\frac{c_{p}}{\tilde{M}c_{q}}=\frac{1}{M}$, since the implied bound on $p/q$ is $M=\tilde{M}c_{q}/c_{p}$.
- In choosing proposals that maximize the success probability, note that both $\tilde{M}$ and $c_{q}$ depend on the proposal, so it is not enough to minimize $\tilde{M}$ alone.
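Since the unknown constants cancel inside the acceptance test, the earlier `rejection_sample` sketch works unchanged when given unnormalized densities; continuing the running example (here $c_{p}=\sqrt{ 2\pi }$ and $c_{q}=2$ are the constants the sampler never sees):

```python
p_tilde = lambda x: np.exp(-x**2 / 2)    # ∝ standard normal; c_p = sqrt(2π)
q_tilde = lambda x: np.exp(-np.abs(x))   # ∝ Laplace; c_q = 2
M_tilde = np.exp(0.5)                    # sup p~/q~ = e^{1/2}, attained at |x| = 1

xs = rejection_sample(p_tilde, q_tilde, lambda: np.random.laplace(),
                      M_tilde, size=10_000)
# Success probability per iteration: c_p / (M~ c_q) = sqrt(2π) / (2√e) ≈ 0.76,
# exactly the 1/M of the normalized version.
```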