This note deals with Markov chains $(X_{t})_{t \ge 0}$ taking values in a countable state space $\mathbb{S}$, with continuous time $t \in \mathbb{R}^{+}$, also known as **continuous time Markov chains (CTMC)**. A CTMC is a special case of a **right-continuous process**, meaning that after each time $t$, $X_{t}$ remains constant for some positive period of time. Recording the times when $(X_{t})$ changes value gives the **jump times** $(T_{n})_{n \in \mathbb{N}}$, i.e. the time when $(X_{t})$ changes value for the $n$th time: $T_{0}:=0, \,\,T_{n}:= \inf \{ t > T_{n-1} \,|\, X_{t} \ne X_{T_{n-1}} \}.$ The amount of time the process waits before the $n$th jump is the **holding time** $Z_{n}:= T_{n}-T_{n-1}.$ The values it takes at the jumps form its **jump process/chain** $(Y_{n})_{n \in \mathbb{N}}:= (X_{T_{n}})_{n \in \mathbb{N}}.$ Note that a right-continuous process has countably many jumps, so this sequence can be enumerated over $\mathbb{N}$. There are three types of such processes, of which the first two do not explode: | ![[CTMCcase1.png#invert]] | ![[CTMCcase2.png#invert]] | | ---- | ---- | | The process jumps infinitely many times. | The process eventually settles down at a constant state. | The process can also make infinitely many jumps in a finite time span: | ![[CTMCcase3.png#invert\|w60\|center]] | | ---- | | The process **explodes** to infinity in finite time, given by $T_{\infty}:= \lim_{n \to \infty}T_{n}$. As defined, the process would "restart" from a finite value after $T_{\infty}$; instead, we can extend the state space to $\mathbb{S} \cup \{ \infty \}$ and set $(X_{t})_{t \ge T_{\infty}}\equiv \infty$ constant to create a **minimal** Markov chain. 
| ## Matrix Representations Just like the discrete time version, the CTMC $(X_{t})$ can be described by its initial distribution $\nu$ and its transitions. However, the main difference is that *the CTMC transitions need to be characterized by both their probabilities $\mathbb{P}[X_{t+h}=j \,|\,X_{t}=i]$ and their holding times $Z_{n} \,|\, \{ X_{T_{n-1}}=i \}$.* ### CTMC as Competing Exponential Jumps The setup of [[Properties of the Exponential Distribution#Competing Exponentials|competing exponentials]] is perfect for modeling both jump probabilities and times: > [!idea] Competing exponentials for jumps > After $X_{t}$ jumps to $i$ at $T_{n}$, a series of clocks indexed by $j \in \mathbb{S}\backslash \{ i \}$ start to tick, and *the first clock to ring decides both the time and destination of the next jump.* > > The time it takes the $j$th clock to ring is $\mathrm{Exp}(q_{ij})$ for some rate constant $q_{ij}$, and as soon as the $j$th clock rings, $X_{t}$ jumps to $j$. By defining the "total ticking rate" $q_{i}:= \sum_{j \in \mathbb{S} \backslash \{ i \}}q_{ij}$, we see that *the competing exponential characterization defines both the time and probability of transitions.* - The holding time $Z_{n} \,|\, \{ X_{T_{n-1}}=i \}$ is the minimum of a family of independent exponentials, so it follows $\mathrm{Exp}\left( \sum_{j \ne i}q_{ij} \right)=\mathrm{Exp}(q_{i})$. - The transition probability $\mathbb{P}[X_{T_{n}}=j \,|\, X_{T_{n-1}}=i]$ equals the probability that the $j$th clock rings first, which is $q_{ij} / q_{i}$. - Note that the jump $i \to i$ is not permitted (unless $i$ is absorbing, in which case the transition rates are $0$ and none of the clocks are ticking). 
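The competing-clock characterization is easy to check by simulation. Below is a minimal sketch (the destination labels and rates are made-up numbers for illustration): sample one exponential clock per destination; the winner's identity should occur with frequency $q_{ij}/q_{i}$, and the winning time should average $1/q_{i}$.

```python
import random

def next_jump(rates, rng):
    """One competing-exponentials step: `rates` maps each destination j
    to q_ij; the first clock to ring decides both the holding time
    and the destination of the jump."""
    clocks = {j: rng.expovariate(q) for j, q in rates.items()}
    dest = min(clocks, key=clocks.get)
    return dest, clocks[dest]

rng = random.Random(0)
rates = {"a": 2.0, "b": 1.0}   # hypothetical rates out of state i; q_i = 3
n = 100_000
wins_a, total_time = 0, 0.0
for _ in range(n):
    dest, hold = next_jump(rates, rng)
    wins_a += dest == "a"
    total_time += hold

print(wins_a / n)      # should be close to q_ia / q_i = 2/3
print(total_time / n)  # should be close to 1 / q_i = 1/3
```

The winning time averages $1/q_{i}$ rather than the mean of any single clock, reflecting that the minimum of independent exponentials is $\mathrm{Exp}(q_{i})$.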
These values can be collected into matrices: > [!definition|*] Rate matrix > The **generator**, or **rate matrix** $Q$, contains the raw ticking rates: $Q:= \begin{array}{} & \begin{array}{} j_{1}\,\,\,\,&j_{2}\,\,\,\,& \cdots\,\,\,\,\, & i\,\, & \cdots \\ \end{array} \\ \begin{array}{} j_{1} \\ \vdots \\ i \\ \vdots\end{array} & \left(\begin{array}{} \\ \\ \\ q_{ij_{1}} & q_{ij_{2}} & \cdots & -q_{i} & \cdots \\ \\ \\ \end{array}\right) \end{array}$ > >That is, the $i$th row of $Q$ contains: >- For $j \ne i$, the entry $q_{ij}$ is the rate of going from state $i$ to state $j$, >- The diagonal entry $q_{ii}$ is the negative of the summed rates: $q_{ii}=-q_{i}=-\sum_{j \ne i}q_{ij}$. Obvious properties include: - Every row sums to $0$: i.e. $Q\mathbf{1}=Q(1,\dots,1)^{T}=(0,\dots,0)^{T}$. - Diagonal entries are non-positive, while off-diagonal ones are non-negative. In fact, any matrix satisfying these properties is called a **Q-matrix**. The transition probabilities are encoded in the jump matrix: > [!definition|*] Jump matrix > The **jump matrix** $\Pi=(\pi_{ij})_{i,j \in \mathbb{S}}$ of a CTMC records the transition probabilities $\mathbb{P}[X_{T_{n}}=j \,|\, X_{T_{n-1}}=i]$: $\pi_{ij}=\begin{cases} {q_{ij}} / {q_{i}} &\text{if }j \ne i \\ 0 &\text{if }j=i \end{cases}$ In the special case of an absorbing state $i$ with $q_{i}=0$, define $\pi_{ij}:= \delta_{ij}.$ - With these definitions, the CTMC stays in state $i$ for a time $Z \sim \mathrm{Exp}(q_{i})$, then jumps to state $j$ with probability $\pi_{ij}$. The competing exponentials model motivates the more formal definition: > [!definition|*] CTMC as Competing Exponentials > Inductively construct a stochastic process $(X_{t})$ and its jump chain $(Y_{n})$. Start with $X_{0}=Y_{0} \sim \nu$. > > For $n=0,\dots$, conditional on $Y_{n}=i$: > - Let $\{ S_{n,j} \sim \mathrm{Exp}(q_{ij}) \,|\, j \in \mathbb{S}, j \ne i\}$ be the set of "clocks" that start ticking after the $n$th jump. 
> - The holding time is $Z_{n+1}:= \inf_{j}S_{n,j}$, i.e. the time it takes for the first clock to ring. > - The destination is $Y_{n+1}= \underset{j \in \mathbb{S}, j \ne i}{\arg\inf}\, S_{n,j}$, i.e. the first clock to ring. > - If the rates are all $0$ (an absorbing state), instead set $Z_{n+1}=\infty$, $Y_{n+1}=Y_{n}$ (a phantom jump). ### CTMC from DTMC and Holding Time Alternatively, we can tackle the time and probability separately: - The probability is modeled with a DTMC, and - The time is modeled as an exponential holding time. In fact, the jump matrix works exactly like the transition matrix in discrete Markov chains: *a CTMC without time is a DTMC*. > [!idea] CTMC constructed with a DTMC and exponential holding times > Given a generator matrix $Q$ and initial distribution $\nu$, use the derived jump matrix $\Pi$ as a transition matrix to define a DTMC $(Y_{n})_{n \in \mathbb{N}} \sim \mathrm{Markov}(\Pi, \nu)$. > > Then compute the time dimension separately: > - Take holding times $Z_{n}\sim \mathrm{Exp}(q_{Y_{n-1}})$ for $n=1,\dots$. > - Compute the jump times $T_{n}:= \sum_{i=1}^{n}Z_{i}$. > > Reintroduce time by defining the process $(X_{t})_{t \in \mathbb{R}^{+}}$, given by $X_{t}=\begin{cases} Y_{n} &\text{if }T_{n} \le t < T_{n+1}\text{ for some }n, \\ \infty &\text{otherwise} \end{cases}$ This process is a minimal CTMC, and $(Y_{n})$ is called its **jump process/chain**. We can then use this construction in the other direction, arriving at the jump chain - holding time definition of a CTMC: > [!definition|*] Continuous Time Markov Chain > A minimal, right-continuous stochastic process $(X_{t})_{t \ge 0}$ is a **continuous time Markov chain (CTMC)** if for some generator matrix $Q$ and initial distribution $\nu$: > - Its jump process $(Y_{n})_{n \in \mathbb{N}}$ is a DTMC $\mathrm{Markov}(\Pi, \nu)$, where $\Pi$ is the jump matrix derived from $Q$. > - Its holding times $Z_{n}$, conditioned on $(Y_{n})$, are independent $\mathrm{Exp}(q_{Y_{n-1}})$ respectively. 
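The jump chain/holding time construction translates directly into a simulator. A minimal sketch, assuming the generator is given as nested lists (the function and variable names are ours; the finite state space rules out explosion):

```python
import random

def simulate_ctmc(Q, i0, t_max, seed=0):
    """Simulate a minimal CTMC from a generator Q (nested lists):
    in state i, hold for an Exp(q_i) time, then jump to j != i
    with probability q_ij / q_i.  Returns the jump times T_n and
    jump-chain states Y_n observed before time t_max."""
    rng = random.Random(seed)
    t, i = 0.0, i0
    T, Y = [0.0], [i0]
    while True:
        q_i = -Q[i][i]
        if q_i == 0:                  # absorbing state: no further jumps
            break
        t += rng.expovariate(q_i)     # holding time Z ~ Exp(q_i)
        if t >= t_max:
            break
        dests = [j for j in range(len(Q)) if j != i]
        weights = [Q[i][j] for j in dests]   # proportional to pi_ij
        i = rng.choices(dests, weights=weights)[0]
        T.append(t)
        Y.append(i)
    return T, Y

# An illustrative three-state generator (same matrix as the examples below).
Q = [[-2, 1, 1], [1, -1, 0], [2, 1, -3]]
T, Y = simulate_ctmc(Q, 0, 10.0)
print(list(zip(T, Y))[:4])
```

Note that the simulator never needs $\Pi$ explicitly: sampling the destination with weights $q_{ij}$ is the same as sampling from the row $\pi_{i\cdot}$.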
### Examples of Matrix Representations > [!examples] Rate matrix, jump matrix and graph of a CTMC > | $Q=\left(\begin{array}{c c c}{{-2}}&{{1}}&{{1}}\\ {{1}}&{{-1}}&{{0}}\\ {{2}}&{{1}}&{{-3}}\end{array}\right)$ | $\Pi=\left(\begin{array}{c c c}{{0}}&{{\frac{1}{2}}}&{{\frac{1}{2}}}\\ {{1}}&{{0}}&{{0}}\\ {{\frac{2}{3}}}&{{\frac{1}{3}}}&{{0}}\end{array}\right)$ | ![[CTMC1.png#invert\|center\|w50]] | | ---- | ---- | ---- | > [!examples] Matrices of simple birth processes > The Poisson process with rate $\lambda$ has rate matrix with $-\lambda$ on the main diagonal and $+\lambda$ on the superdiagonal. That is, $q_{ij}= \begin{cases} -\lambda & \text{if }i=j \\ +\lambda & \text{if }i = j - 1 \\ 0 & \text{otherwise} \end{cases}$ > > The Yule process with birth rate $\lambda$ is similar, but with the rate scaling with the population $i$: the rate matrix is $q_{ij}= \begin{cases} -i\lambda & \text{if }i=j \\ +i \lambda & \text{if } i = j - 1 \\ 0 & \text{otherwise} \end{cases}$ For simple birth processes like the above (i.e. processes with no deaths or simultaneous births), the jump matrix is trivial: filled with $0$ except for $\pi_{i, i+1}=1$. ## Markov Properties The CTMC has the **Markov property**: given $X_{t}=k$, the past $(X_{s})_{s < t}$ is independent of the future $(X_{t+s})_{s \ge 0}$. Furthermore, the future $(X_{t+s})_{s \ge 0}$ is another CTMC with the same rate matrix $Q$, started at $k$. The **strong Markov property** states that the same independence holds for any [[Filtrations and Stopping Times#Stopping Time and Stopped Processes|stopping time]] $T$: for any $k \in \mathbb{S}$ such that $\mathbb{P}[X_{T}=k] \ne 0$, conditioning on $\{ X_{T}=k \}$ gives: $\begin{align*} (X_{s})_{s < T} &\perp (X_{T+s})_{s \ge 0},\\[0.4em] (X_{T+s})_{s \ge 0} &\sim \mathrm{Markov}(Q, \delta_{k}). \end{align*}$ - Note that it's not necessary to condition on the value of $T$. 
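The triviality of the jump matrix for simple birth processes can be checked mechanically. A minimal sketch, assuming a Yule process truncated to finitely many states (helper names are ours; truncation makes the last state absorbing, and state $0$ is absorbing since an empty population cannot give birth):

```python
def yule_Q(lam, n):
    """Rate matrix of a Yule process with birth rate lam,
    truncated to states 0..n-1 (the last state becomes absorbing)."""
    Q = [[0.0] * n for _ in range(n)]
    for i in range(n - 1):
        Q[i][i], Q[i][i + 1] = -i * lam, i * lam
    return Q

def jump_matrix(Q):
    """Derive the jump matrix Pi from a Q-matrix:
    pi_ij = q_ij / q_i off the diagonal, 0 on it, and
    pi_ij = delta_ij for an absorbing state (q_i = 0)."""
    n = len(Q)
    Pi = [[0.0] * n for _ in range(n)]
    for i in range(n):
        q_i = -Q[i][i]
        if q_i == 0:
            Pi[i][i] = 1.0              # absorbing: identity row
        else:
            for j in range(n):
                if j != i:
                    Pi[i][j] = Q[i][j] / q_i
    return Pi

Pi = jump_matrix(yule_Q(2.0, 5))
# every non-absorbing state i jumps to i+1 with probability 1
print(Pi)
```

The same `jump_matrix` helper reproduces the $\Pi$ of the three-state example above when fed that $Q$.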
## Transition Matrices For a CTMC, the transition probability between $i,j \in \mathbb{S}$ over a timespan of $s$ is $p_{ij}(s)=\mathbb{P}[X_{t+s}=j \,|\, X_{t}=i],$ which is independent of the choice of $t$ by the Markov property. > [!idea] > Moving forward in time is done by applying the matrix $P(t)$ to the initial distribution $\nu$, viewed as a row vector: $\mathbb{P}[X_{t}=i]=(\nu^{T}P(t))_{i}.$ >This is analogous to the DTMC version $\nu^{T}P^{n}$. Arranging these probabilities into a matrix $P(t)=(p_{ij}(t))_{i,j \in \mathbb{S}}$ gives the **transition semigroup** of the CTMC, $\{ P(t) \,|\, t \ge 0 \}$. It has the properties: - $P(0)=I$, - $P(t+s)=P(t)P(s)$ under matrix multiplication. ### Backward and Forward Equations > [!idea] The CTMC's transition matrix $P(t) = e^{tQ}=(e^{Q})^{t}$ is the continuous-time equivalent of the DTMC's $n$-step transition matrix $P^{n}$. > [!theorem|*] Backward Equation > If $(X_{t}) \sim \mathrm{Markov}(Q, \nu)$ is a CTMC, then its transition matrices $P(t)$ satisfy the **backward equation**: $P'(t):= (p_{ij}'(t))_{i,j \in \mathbb{S}}=QP(t).$ Furthermore, given the initial condition $P(0)=I$, any other non-negative solution $\tilde{P}$ to the equation must be element-wise greater than or equal to $P$. > > > [!proof]- > > We prove the equation element-wise for $p_{ij}(t)$. 
Condition on the time of the first jump, $T_{1}$, and its destination $k$: $\begin{align*} > p_{ij}(t)&= \mathbb{P}_{i}[X_{t}=j ~~\&~~ T_{1} > t] + \mathbb{P}_{i}[X_{t}=j ~~ \& ~~ T_{1} \le t] \\ > &= \delta_{ij}e^{-q_{i}t} + \int _{0}^{t}\mathbb{P}_{i}[X_{t}=j \,|\, T_{1}=s] \cdot q_{i}e^{-q_{i}s} \, ds\\[0.8em] > & \begin{split} =\delta_{ij}e^{-q_{i}t}+ \int _{0}^{t}\sum_{k \ne i} \big\{ \mathbb{P}_{i}&[X_{t}=j \,|\, X_{T_{1}}=k, T_{1}=s] \cdot \\ &\mathbb{P}_{i}[X_{T_{1}}=k \,|\, T_{1}=s] \big\}\cdot q_{i}e^{-q_{i}s} \, ds > \end{split} \\[0.8em] > &= \delta_{ij}e^{-q_{i}t} + \int _{0}^{t} \sum_{k \ne i} p_{kj}(t-s) \cdot \pi_{ik} \cdot q_{i}e^{-q_{i}s} \, ds\\ > &= \delta_{ij}e^{-q_{i}t} + \int _{0}^{t} \sum_{k \ne i} p_{kj}(u)\, \pi_{ik}\, q_{i}e^{-q_{i}(t - u)} \, du & [u=t-s] > \end{align*}$ > Multiplying both sides by $e^{q_{i}t}$ gives $p_{ij}(t)e^{q_{i}t}=\delta_{ij} + \int _{0}^{t} \sum_{k \ne i} p_{kj}(u)\, \pi_{ik}\, q_{i}e^{q_{i} u} \, du,$ and taking derivatives in $t$ gives $\begin{align*} > (q_{i}p_{ij}(t) + p_{ij}'(t))\cancel{e^{q_{i}t}}&= \sum_{k \ne i} p_{kj}(t) \underbrace{\pi_{ik} q_{i}}_{=q_{ik}}\cdot\cancel{e^{q_{i}t}}\\ > p'_{ij}(t)&=\sum_{k}q_{ik}\,p_{kj}(t), > \end{align*}$ where the $-q_{i}p_{ij}=q_{ii}p_{ij}$ term is absorbed into the sum as the $k=i$ term. This is the element-wise form of the backward equation $P'(t)=QP(t)$. - The backward equation has a unique solution with $\sum_{j \in \mathbb{S}}p_{ij}(t)=1$ for every $i \in \mathbb{S}$, so non-uniqueness can only arise for exploding CTMCs. > [!theorem|*] Forward Equation > Similarly, if we assume $(X_{t})$ to be minimal, then it also satisfies the forward equation $P'(t)=P(t)Q,$ with $P$ being the minimal solution under the initial condition $P(0)=I$. > > [!proof] Proof for finite state space > > Recall that for $i \ne j$, $P_{ij}(h)=(q_{i}h +o(h))\pi_{ij}=q_{ij}h +o(h)$, and $P_{jj}(h)=1-q_{j}h+o(h)$. 
Therefore, conditioning on the state at time $t$: $\begin{align*} P_{ij}(t+h)&= P_{ij}(t)P_{jj}(h)+\sum_{k \ne j}P_{ik}(t) P_{kj}(h)\\ &= P_{ij}(t)(1-q_{j}h +o(h) ) + \sum_{k \ne j} P_{ik}(t)(q_{kj}h + o(h))\\ &= P_{ij}(t) +h\left( \sum_{k \in \mathbb{S}} P_{ik}(t)q_{kj} \right)+o(h)\\ P_{ij}(t+h)-P_{ij}(t)&= h(P(t)Q)_{ij}+o(h), \end{align*}$ and dividing by $h$ then letting $h \to 0$ gives the forward equation. > > > > In the third equality, collecting the error terms as $\sum_{k} P_{ik}(t)\cdot o(h)=o(h)$ is only justified for finite state spaces. These equations motivate the identity $P(t)=e^{tQ}=(e^{Q})^{t},$ which fills in the gaps between the discrete sequence of matrices $(I, P, P^{2},\dots)=(P(0), P(1),P(2),\dots)$. A few properties follow: - $P^{(k)}(t)=Q^{k}P(t)$, so in particular $P^{(k)}(0)=Q^{k}$, where the derivatives are again element-wise.
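On a finite state space, the identity $P(t)=e^{tQ}$ and the semigroup property can be verified numerically. A minimal sketch using a truncated Taylor series for the matrix exponential (adequate for this small, well-scaled example; helper names are ours):

```python
def mat_mul(A, B):
    """Multiply two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm(Q, t, terms=60):
    """Approximate e^{tQ} by the Taylor series sum_k (tQ)^k / k!."""
    n = len(Q)
    P = [[float(i == j) for j in range(n)] for i in range(n)]  # P = I
    term = [row[:] for row in P]
    for k in range(1, terms):
        # term <- term * (tQ) / k, so term = (tQ)^k / k!
        term = mat_mul(term, [[t * q / k for q in row] for row in Q])
        P = [[P[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return P

# The three-state generator from the examples section.
Q = [[-2, 1, 1], [1, -1, 0], [2, 1, -3]]
P1, P2 = expm(Q, 1.0), expm(Q, 2.0)

# rows of P(t) are probability distributions (no explosion here)
print([round(sum(row), 6) for row in P1])
# semigroup property: P(1) P(1) = P(2)
err = max(abs(mat_mul(P1, P1)[i][j] - P2[i][j])
          for i in range(3) for j in range(3))
print(err < 1e-6)
```

A raw Taylor series is numerically fragile for large $\Vert tQ\Vert$; production code would use a scaling-and-squaring matrix exponential instead.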