Consider a collection of models $\mathcal{M}$, where each model $m \in \mathcal{M}$ has its own set of parameters $\theta \sim \pi(\theta ~|~ m)$, but the dimension of $\theta ~|~ m$, denoted $p_{m}$, in general differs across models.
- For example, in a regression setting $Y \sim X^{T}\theta$ where $m$ performs variable selection, $p_{m}$ is the number of features selected by $m$.

We wish to use MCMC to sample from the joint distribution $\pi(\theta,m)$, so our samples are $X^{(t)}=(\theta^{(t)}, m^{(t)})$. This makes traditional MCMC (like vanilla [[Metropolis Hastings]]) difficult, since it assumes the sample space $\Omega$ has a constant dimension for reversibility, whereas the coefficient vector $\theta^{(t)}$ has dimension $p_{m^{(t)}}$, which changes over time:
$\theta^{(t)}=\begin{bmatrix} \theta^{(t)}_{1} \\ \vdots \\ \theta^{(t)}_{p_{m^{(t)}}} \end{bmatrix} \xrightarrow{???} \begin{bmatrix} \theta^{(t+1)}_{1} \\ \vdots \\ \vdots \\ \theta^{(t+1)}_{p_{m^{(t+1)}}} \end{bmatrix}= \theta^{(t+1)}$

To solve this issue, when jumping from $(\theta ,m)$ to $(\theta',m')$, RJMCMC pads $\theta$ and $\theta'$ with auxiliary variables $U,U'$ so that their dimensions match, then makes the jump with the **proposal function** $\phi_{m,m'}$:
$\theta', u':=\phi_{m,m'}(\theta,u),$
and we require the family $\{ \phi_{m,m'}\}$ to be **involutions**, in the sense that $\phi_{m',m}(\theta',u')=(\theta,u)$ holds at the same time, i.e. mapping twice recovers the original model's parameters. From now on we drop the subscript and write $\phi$.
- This involution can be defined by choosing the map's first component $\theta':=\phi_{1}(\theta,u)$, then requiring $u'$ to be the solution to $\theta=\phi_{1}(\theta', u')$. As long as our choice of $\phi_{1}$ is "invertible" in this sense, this defines the relationship $\phi_{2}(\theta,u):= u'$, and we can write $\phi:= (\phi_{1},\phi_{2})$.
- For example, if $\phi_{1}(\theta,u)=\theta+u$, then $u'=\phi_{2}(\theta,u)=-u$.

So given a proposed jump $m\to m'$, we have
$\theta=\begin{bmatrix} \theta_{1} \\ \vdots \\ \theta_{p_{m}} \end{bmatrix} \xrightarrow{\text{sample }U} \begin{array}{c} \begin{bmatrix} \theta_{1} \\ \vdots \\ \theta_{p_{m}} \end{bmatrix} \\ \begin{bmatrix} U_{1} \\ \vdots \\ \vdots \\ U_{\tilde{p}_{m}} \end{bmatrix} \end{array} \xleftrightarrow[\phi_{m',m}]{~~~\phi_{m,m'}~~}\begin{array}{c} \begin{bmatrix} \theta'_{1} \\ \vdots \\ \vdots \\ \theta'_{p_{m'}} \end{bmatrix} \\ \begin{bmatrix} U'_{1} \\ \vdots \\ U'_{\tilde{p}_{m'}} \end{bmatrix} \end{array} \xrightarrow{\text{drop }U'}\begin{bmatrix} \theta'_{1} \\ \vdots \\ \vdots \\ \theta'_{p_{m'}} \end{bmatrix}=\theta',$
where $U,U'$ have dimensions $\tilde{p}_{m},\tilde{p}_{m'}$ respectively, so that the padded dimensions are equal on both sides: $p_{m}+\tilde{p}_{m}=p_{m'}+\tilde{p}_{m'}$. Therefore,

> [!definition|*] RJMCMC Proposal
> The RJMCMC uses an implicit proposal given by
> $\begin{align*} m' &\sim q_{M}(m' ~|~ m),\\ U &\hphantom{:}\sim q_{U}(u ~|~ m,m', \theta),\\ \theta', U' &:= \phi_{m,m'}(\theta, U), \end{align*}$
> for some proposal distributions $q_{M},q_{U}$ and involution $\phi_{m,m'}$ of our choice.
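As a minimal sketch (not part of the derivation above), the proposal can be written as one function taking hypothetical user-supplied callables `sample_model`, `sample_aux`, and `involution`, standing in for $q_{M}$, $q_{U}$, and $\phi_{m,m'}$; `shift_involution` illustrates the $\phi_{1}(\theta,u)=\theta+u$ example.

```python
def rjmcmc_propose(m, theta, sample_model, sample_aux, involution, rng):
    """One RJMCMC proposal: jump the model, pad with auxiliary variables,
    then map through the involution.

    `sample_model`, `sample_aux`, `involution` are hypothetical
    user-supplied callables standing in for q_M, q_U and phi_{m,m'}."""
    m_new = sample_model(m, rng)                        # m' ~ q_M(. | m)
    u = sample_aux(m, m_new, theta, rng)                # U ~ q_U(. | m, m', theta)
    theta_new, u_new = involution(m, m_new, theta, u)   # (theta', U') := phi_{m,m'}(theta, U)
    return m_new, theta_new, u, u_new

# The phi_1(theta, u) = theta + u example: phi(theta, u) = (theta + u, -u),
# which applied twice recovers (theta, u), so it is an involution.
def shift_involution(m, m_new, theta, u):
    return theta + u, -u
```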
We use the acceptance probability derived for [[Metropolis Hastings#Proposals with Transformations|M-H proposals with transformations]], which is (without the jump $m\to m'$, just $(\theta,u)\to (\theta',u')$)
$\alpha(\theta',u'~|~\theta,u)=\min \left\{ 1, \frac{\pi(\theta')q_{U}(u')}{\pi(\theta)q_{U}(u)}\cdot \left| \det\frac{ \partial (\theta',u') }{ \partial (\theta,u) } \right| \right\}.$
Now, incorporating the jump between models, we have

> [!definition|*] RJMCMC Acceptance Probability
> $\alpha(m',\theta',u'~|~m,\theta,u)=\min \left\{ 1, \frac{\pi(m',\theta')q_{M}(m ~|~ m')q_{U}(u')}{\pi(m,\theta)q_{M}(m' ~|~ m) q_{U}(u)}\cdot \left|\det \frac{ \partial (\theta',u') }{ \partial (\theta,u) } \right| \right\}.$

### Example: Variable Selection

Let $m$ be a set of feature indices, and let $\theta$ contain the coefficients of the features included in $m$. For simplicity, let our proposal be:
- Transition between models $m,m'$ by uniformly randomly adding or dropping one feature at a time, so out of $p$ features, $q_{M}(m' ~|~ m)= \begin{cases} \frac{1}{2| m |} & \text{if dropping a feature} \\ \frac{1}{2(p-| m |)} & \text{if adding a feature} \end{cases}$
- When adding a feature $j$, its coefficient $\theta_{j}'$ is proposed by $\theta'_{j}=g(\theta, u)$, so that the proposal function $\phi$ has $\phi_{1}(\theta, u)= \theta \cup g(\theta,u).$
- To ensure $\phi$ is an involution (in particular invertible), we need $\dim (\theta, u)=\dim (\theta',u')$, so WLOG we assume $\dim(u)=1$ (or whatever dimension $\theta'_{j}$ has) and $u' = \varnothing$.
- Therefore, we have $\phi(\theta, u)=(\theta', \varnothing)$, yielding the Jacobian (WLOG letting $\theta'_{j}$ be the last entry of $\theta'$)
$\begin{align*} \text{Adding variable: }~J(\theta, u)&= \left| \det\begin{pmatrix} \dfrac{ \partial \theta' }{ \partial \theta }& \dfrac{ \partial \theta' }{ \partial u } \end{pmatrix}\right |=\left|\det \begin{pmatrix} 1 \\ & \ddots \\ & & 1 \\ \leftarrow & \frac{ \partial \theta'_{j} }{ \partial \theta } & \rightarrow & \frac{ \partial \theta'_{j} }{ \partial u } \end{pmatrix}\right |=\left| \frac{ \partial g(\theta, u) }{ \partial u } \right |.\\ \text{Deleting variable: }~J(\theta, \varnothing)&= \left|\det \begin{pmatrix} \dfrac{ \partial \theta' }{ \partial \theta } \\ \\ \dfrac{ \partial u' }{ \partial \theta } \end{pmatrix} \right|= \left|\det \begin{pmatrix} 1 & & & \uparrow \\ & \ddots & & \frac{ \partial \theta'_{j} }{ \partial \theta } \\ & & 1 & \downarrow \\ & & & \frac{ \partial u' }{ \partial \theta_{j} } \end{pmatrix}\right |= \left| \frac{ \partial g(\theta, u) }{ \partial u } \right |^{-1}. \end{align*}$

Then we obtain the proposal kernels
$\begin{align*} \text{Adding variable: }~q(m', \theta', \varnothing ~|~ m, \theta, u)&= \frac{1}{2(p-| m |)}q_{U}(du) \\ \text{Deleting variable: }~q(m', \theta', u' ~|~ m, \theta, \varnothing)&= \frac{1}{2| m |} \end{align*}$
and the acceptance probabilities
$\begin{align*} \text{Adding: }~\alpha(m', \theta', \varnothing ~|~ m, \theta, u)&= \min\left( 1, \frac{p(\theta', m' ~|~ y)}{p(\theta, m ~|~ y)}\cdot \frac{p-| m |}{| m'|}\cdot\frac{1}{q_{U}(u)}\cdot \left| \frac{ \partial g(\theta, u) }{ \partial u } \right| \right)\\[0.8em] \text{Deleting: }~ \alpha(m', \theta', u' ~|~ m, \theta, \varnothing)&= \min\left( 1, \frac{p(\theta', m' ~|~ y)}{p(\theta, m ~|~ y)}\cdot \frac{| m |}{p-| m' |}\cdot {q_{U}(u')}\cdot \left| \frac{ \partial g(\theta, u) }{ \partial u } \right|^{-1} \right) \end{align*}$
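To make the add/drop moves concrete, here is a minimal sketch of one variable-selection RJMCMC step. It assumes (these choices are not forced by the derivation above) that $g(\theta,u)=u$ so the Jacobian factor is $1$, that $q_{U}$ is a fixed Gaussian, and that a hypothetical user-supplied `log_post(m, theta)` returns $\log p(\theta, m ~|~ y)$ up to a constant; boundary cases $|m|\in\{0,p\}$ are not handled.

```python
import numpy as np

def rjmcmc_varsel_step(m, theta, log_post, p, rng, sd=1.0):
    """One add/drop RJMCMC step for variable selection (sketch).

    Assumes g(theta, u) = u (Jacobian 1), q_U = Normal(0, sd^2), and
    log_post(m, theta) = log p(theta, m | y) up to a constant.
    `m` is a sorted list of selected indices, `theta` the matching
    coefficient vector; boundary cases |m| in {0, p} are not handled."""
    log_qU = lambda u: -0.5 * (u / sd) ** 2 - np.log(sd * np.sqrt(2 * np.pi))

    if rng.random() < 0.5:  # propose adding a uniformly chosen unselected feature
        candidates = [j for j in range(p) if j not in m]
        j = candidates[rng.integers(len(candidates))]
        u = rng.normal(0.0, sd)                       # new coefficient theta'_j = g(theta, u) = u
        m_new = sorted(m + [j])
        theta_new = np.insert(theta, m_new.index(j), u)
        # alpha = posterior ratio * (p - |m|)/|m'| * 1/q_U(u) * |dg/du|, with |dg/du| = 1
        log_alpha = (log_post(m_new, theta_new) - log_post(m, theta)
                     + np.log(p - len(m)) - np.log(len(m_new)) - log_qU(u))
    else:                   # propose dropping a uniformly chosen selected feature
        k = rng.integers(len(m))
        u_rev = theta[k]                              # coefficient the reverse (add) move would draw
        m_new = [i for i in m if i != m[k]]
        theta_new = np.delete(theta, k)
        # alpha = posterior ratio * |m|/(p - |m'|) * q_U(u') * |dg/du|^{-1}, with |dg/du| = 1
        log_alpha = (log_post(m_new, theta_new) - log_post(m, theta)
                     + np.log(len(m)) - np.log(p - len(m_new)) + log_qU(u_rev))

    if np.log(rng.random()) < log_alpha:
        return m_new, theta_new
    return list(m), np.asarray(theta)
```

In this sketch the new coefficient is drawn directly from $q_{U}$ (the common birth/death choice); any differentiable $g$ would work as long as the corresponding Jacobian factor is included.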