This note deals with Markov chains $(X_{t})_{t \ge 0}$ taking values over a countable state space $\mathbb{S}$ and continuous time $t \in \mathbb{R}^{+}$, also known as **continuous time Markov chains (CTMC)**.
It is a special case of a **right-continuous process**, meaning that after each time $t$, $X$ remains constant on some interval $[t, t+\epsilon)$ before its next jump.
Recording the time when $(X_{t})$ changes value gives the **jump times** $(T_{n})_{n \in \mathbb{N}}$, i.e. the time $t$ when $(X_{t})$ changes value for the $n$th time: $T_{0}:=0, \,\,T_{n}:= \inf \{ t > T_{n-1} \,|\, X_{t} \ne X_{T_{n-1}} \}.$
The amount of time $X_{t}$ waited before jumping is the **holding time** $Z_{n}:= T_{n}-T_{n-1}.$
The values it takes at the jumps form its **jump process/chain** $(Y_{n})_{n \in \mathbb{N}}:= (X_{T_{n}})_{n \in \mathbb{N}}$. Note that a right-continuous process has countably many jumps, so this sequence can be enumerated over $\mathbb{N}$.
There are three types of such processes, where the first two do not explode:
| ![[CTMCcase1.png#invert]] | ![[CTMCcase2.png#invert]] |
| ---- | ---- |
| The process jumps infinitely many times. | The process eventually settles down at a constant state. |
The process can also make infinitely many jumps in a finite time span:
| ![[CTMCcase3.png#invert\|w60\|center]] |
| ---- |
| The process **explodes** to infinity in finite time, given by $T_{\infty}:= \lim_{n \to \infty}T_{n}$. The process could then "restart" at a finite value; instead, we extend the state space to $\mathbb{S} \cup \{ \infty \}$ and set $(X_{t})_{t \ge T_{\infty}}\equiv \infty$ to obtain a **minimal** Markov chain. |
## Matrix Representations
Just like the discrete time version, the CTMC $(X_{t})$ can be described by its initial distribution $\nu$ and transitions.
However, the main difference is that *the CTMC transitions need to be characterized by both their probabilities $\mathbb{P}[X_{t+h}=j \,|\,X_{t}=i]$ and their holding times $Z_{n} \,|\, \{ X_{T_{n-1}}=i \}$.*
### CTMC as Competing Exponential Jumps
The setup of [[Properties of the Exponential Distribution#Competing Exponentials|competing exponentials]] is perfect for modeling both jump probabilities and times:
> [!idea] Competing exponentials for jumps
> After $X_{t}$ jumps to $i$ at $T_{n}$, a series of clocks indexed by $j \in \mathbb{S}\backslash \{ i \}$ start to tick, and *the first clock to ring decides both the time and destination of the next jump.*
>
> The time it takes each clock to ring is $\mathrm{Exp}(q_{ij})$ for some rate-constant $q_{ij}$, and as soon as the $j$th clock rings, $X_{t}$ jumps to $j$.
By defining the "total ticking rate" $q_{i}:= \sum_{j \in \mathbb{S} \backslash \{ i \}}q_{ij}$, we see that *the competing exponential characterization defines both the time and probability of transitions.*
- The holding time $Z_{n} \,|\, \{ X_{T_{n-1}}=i \}$ is the minimum of a family of independent exponentials, which follows $\mathrm{Exp}\left( \sum_{j \ne i}q_{ij} \right)=\mathrm{Exp}(q_{i})$.
- The transition probability $\mathbb{P}[X_{T_{n}}=j \,|\, X_{T_{n-1}}=i]$ equals the probability that the $j$th clock rings first, which is $q_{ij} / q_{i}$.
- Note that the jump $i \to i$ is not permitted (unless $i$ is absorbing, in which case all the rates $q_{ij}$ are $0$ and none of the clocks tick).
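The two bullet points above can be checked empirically. The following sketch simulates one round of competing clocks; the two-destination rates $q_{i1}=2$, $q_{i2}=1$ (so $q_{i}=3$) are made-up illustration values, and `competing_jump` is my own helper name.

```python
import random

def competing_jump(rates, rng):
    """Run one round of competing exponential clocks: each destination j
    gets an Exp(rates[j]) ring time, and the first clock to ring decides
    both the holding time and the destination of the jump."""
    ring_times = {j: rng.expovariate(q) for j, q in rates.items() if q > 0}
    dest = min(ring_times, key=ring_times.get)
    return dest, ring_times[dest]

# Illustrative rates out of the current state: q_i = 2.0 + 1.0 = 3.0.
rng = random.Random(0)
samples = [competing_jump({1: 2.0, 2: 1.0}, rng) for _ in range(100_000)]

# Empirically, P[jump to 1] should be near q_i1/q_i = 2/3,
# and the mean holding time near 1/q_i = 1/3.
frac_to_1 = sum(1 for d, _ in samples if d == 1) / len(samples)
mean_hold = sum(t for _, t in samples) / len(samples)
```

The empirical fractions match the claimed $\mathrm{Exp}(q_{i})$ holding time and $q_{ij}/q_{i}$ destination probabilities.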
These values can be collected into matrices:
> [!definition|*] Rate matrix
> The **generator**, or **rate matrix** $Q$, contains the raw ticking rates: $Q:= \begin{array}{c c} &
\begin{array}{c c c c c}
j_{1}\,\,\,\,&j_{2}\,\,\,\,& \cdots\,\,\,\,\, & i\,\, & \cdots \\
\end{array} \\
\begin{array}{c} j_{1} \\ \vdots \\ i \\ \vdots\end{array} &
\left(\begin{array}{c c c c c}
\\ \\ \\
q_{ij_{1}} & q_{ij_{2}} & \cdots & -q_{i} & \cdots \\ \\ \\
\end{array}\right)
\end{array}$
>
>That is, the $i$th row of $Q$ contains:
>- For $j \ne i$, $q_{ij}$ is the rate going from state $i$ to state $j$,
>- The diagonal entry $q_{ii}$ is the negative of the summed rates $q_{i}=\sum_{j \ne i}q_{ij}$.
Obvious properties include:
- Every row sums to $0$: i.e. $Q\mathbf{1}=Q(1,\dots,1)^{T}=(0,\dots,0)^{T}$.
- Diagonal entries are all non-positive, while off-diagonal ones are non-negative.
In fact, any matrix satisfying these properties is called a **Q-matrix**.
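These properties are easy to check mechanically; a minimal sketch (the helper name `is_q_matrix` and the tolerance are my own choices):

```python
def is_q_matrix(Q, tol=1e-9):
    """Check the Q-matrix properties: non-positive diagonal entries,
    non-negative off-diagonal entries, and every row summing to 0."""
    for i, row in enumerate(Q):
        if row[i] > tol:                                   # diagonal <= 0
            return False
        if any(row[j] < -tol for j in range(len(row)) if j != i):
            return False                                   # off-diagonal >= 0
        if abs(sum(row)) > tol:                            # row sums to 0
            return False
    return True
```

Note that an absorbing state (an all-zero row) passes the check, which is why the diagonal condition is "non-positive" rather than "negative".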
The transition probabilities are encoded in the jump matrix:
> [!definition|*] Jump matrix
> The **jump matrix** $\Pi=(\pi_{ij})_{i,j \in \mathbb{S}}$ of a CTMC records the transition probabilities $\mathbb{P}[X_{T_{n}}=j \,|\, X_{T_{n-1}}=i]$: $\pi_{ij}=\begin{cases}
{q_{ij}} / {q_{i}} &\text{if }j \ne i \\
0 &\text{if }j=i
\end{cases}$ In the special case of an absorbing state $i$ with $q_{i}=0$, define $\pi_{ij}:= \delta_{ij}.$
- With those definitions, the CTMC stays in state $i$ for a time of $Z \sim \mathrm{Exp}(q_{i})$, and then jumps to state $j$ with probability $\pi_{ij}$.
The competing exponentials model motivates the more formal definition:
> [!definition|*] CTMC as Competing Exponentials
> Inductively construct a stochastic process $(X_{t})$ and its jump chain $(Y_{n})$. Start with $X_{0}=Y_{0} \sim \nu$.
>
> For $n=0,1,\dots$, conditional on $Y_{n}=i$:
> - Let $\{ S_{n,j} \sim \mathrm{Exp}(q_{ij}) \,|\, j \in \mathbb{S}, j \ne i\}$ be the set of "clocks" that start ticking after the $n$th jump.
> - The holding time is $Z_{n}:= \inf_{j}S_{n,j}$, i.e. the time it takes for a clock to ring.
> - The destination is $Y_{n+1}= \underset{j \in \mathbb{S}, j \ne i}{\arg\inf}\, S_{n,j}$, i.e. the clock to ring first.
> - If the rates are all $0$ (i.e. $i$ is absorbing), instead set $Z_{n}=\infty$, $Y_{n+1}=Y_{n}$ (a phantom jump).
### CTMC from DTMC and Holding Time
Alternatively, we can tackle the time and probability separately:
- The probability is modeled with a DTMC, and
- The time is modeled as an exponential holding time.
In fact, the jump matrix works exactly like the transition matrix in discrete Markov chains: *CTMC without time is DTMC*.
> [!idea] CTMC constructed with a DTMC and exponential holding times
> Given a generator matrix $Q$ and initial state $\nu$, use the derived jump matrix $\Pi$ as a transition matrix to define a DTMC $(Y_{n})_{n \in \mathbb{N}} \sim \mathrm{Markov}(\Pi, \nu)$.
>
> Then compute the time dimension separately:
> - Take holding times $Z_{n}\sim \mathrm{Exp}(q_{Y_{n-1}})$ for $n=1,\dots$.
> - Compute the jump times $T_{n}:= \sum_{i=1}^{n}Z_{i}$.
>
> Reintroduce time by defining the process $(X_{t})_{t \in \mathbb{R}^{+}}$, given by $X_{t}=\begin{cases}
Y_{n} &\text{if }T_{n} \le t < T_{n+1}\text{ for some }n, \\
\infty &\text{otherwise}
\end{cases}$
This process is a minimal CTMC, and $Y_{n}$ is called its **jump process/chain**.
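This jump chain + holding time construction translates directly into a simulation sketch (assuming a finite state space indexed $0,\dots,n-1$; the function and variable names are my own):

```python
import random

def simulate_ctmc(Q, start, t_max, rng):
    """Simulate a CTMC path on [0, t_max] via the jump-chain / holding-time
    construction: wait Z ~ Exp(q_i), then jump with probabilities pi_ij."""
    path = [(0.0, start)]            # list of (jump time T_n, state Y_n)
    t, state = 0.0, start
    while True:
        q_i = -Q[state][state]
        if q_i == 0:                 # absorbing state: no further jumps
            break
        t += rng.expovariate(q_i)    # holding time Z_n ~ Exp(q_i)
        if t > t_max:
            break
        # Pick the destination j != i with probability q_ij / q_i.
        i = state
        dests = [j for j in range(len(Q)) if j != i and Q[i][j] > 0]
        u, acc = rng.random(), 0.0
        state = dests[-1]            # fallback against rounding error
        for j in dests:
            acc += Q[i][j] / q_i
            if u <= acc:
                state = j
                break
        path.append((t, state))
    return path

rng = random.Random(1)
Q = [[-2.0, 1.0, 1.0], [1.0, -1.0, 0.0], [2.0, 1.0, -3.0]]
path = simulate_ctmc(Q, start=0, t_max=10.0, rng=rng)
```

Every recorded jump changes the state, so stripping the times from `path` recovers the jump chain $(Y_{n})$.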
We can then use this construction in the other direction, arriving at the jump chain - holding time definition of CTMC:
> [!definition|*] Continuous Time Markov Chain
> A minimal, right-continuous stochastic process $(X_{t})_{t \ge 0}$ is a **continuous time Markov chain (CTMC)** if for some generator matrix $Q$ and initial distribution $\nu$:
> - Its jump process $(Y_{n})_{n \in \mathbb{N}}$ is a DTMC $\mathrm{Markov}(\Pi, \nu)$, where $\Pi$ is the jump matrix derived from $Q$.
> - Its holding times $Z_{n}$, conditioned on $(Y_{n})$, are independent with $Z_{n} \sim \mathrm{Exp}(q_{Y_{n-1}})$.
### Examples of Matrix Representations
> [!examples] Rate matrix, jump matrix and graph of a CTMC
>
| $Q=\left(\begin{array}{c c c}{{-2}}&{{1}}&{{1}}\\ {{1}}&{{-1}}&{{0}}\\ {{2}}&{{1}}&{{-3}}\end{array}\right)$ | $\Pi=\left(\begin{array}{c c c}{{0}}&{{\frac{1}{2}}}&{{\frac{1}{2}}}\\ {{1}}&{{0}}&{{0}}\\ {{\frac{2}{3}}}&{{\frac{1}{3}}}&{{0}}\end{array}\right)$ | ![[CTMC1.png#invert\|center\|w50]] |
| ---- | ---- | ---- |
> [!examples] Matrices of simple birth processes
> The Poisson process with rate $\lambda$ has rate matrix with $-\lambda$ on the main diagonal and $+\lambda$ on the one above it. That is, $q_{ij}= \begin{cases}
-\lambda & \text{if }i=j \\
+\lambda & \text{if }i = j - 1 \\
0 & \text{otherwise}
\end{cases}$
>
> The Yule process with birth rate $\lambda$ is similar, but with the rate scaling with the population $i$: the rate matrix is $q_{ij}= \begin{cases}
-i\lambda & \text{if }i=j \\
+i \lambda & \text{if } i = j - 1 \\
0 & \text{otherwise}
\end{cases}$
For simple birth processes like the ones above (i.e. processes with no deaths or simultaneous births), the jump matrix is trivial: filled with $0$ except for $\pi_{i, i+1}=1$.
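Both birth-process generators fit one pattern; the sketch below builds them truncated to finitely many states so they fit in a matrix (the truncation makes the last state artificially absorbing, and `birth_rate_matrix` is my own helper):

```python
def birth_rate_matrix(rate_of, n_states):
    """Truncated generator of a pure birth process on {0, ..., n_states-1}:
    state i only jumps to i+1, at rate rate_of(i)."""
    Q = [[0.0] * n_states for _ in range(n_states)]
    for i in range(n_states - 1):
        Q[i][i + 1] = rate_of(i)      # tick rate of the single clock
        Q[i][i] = -rate_of(i)         # diagonal = -(sum of the row's rates)
    return Q

lam = 1.5                                            # illustrative rate
poisson_Q = birth_rate_matrix(lambda i: lam, 5)      # q_{i,i+1} = lambda
yule_Q = birth_rate_matrix(lambda i: i * lam, 5)     # q_{i,i+1} = i*lambda
```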
## Markov Properties
The CTMC has the **Markov property**: given $X_{t}=k$, the past $(X_{s})_{s < t}$ is independent of the future $(X_{t+s})_{s \ge 0}$. Furthermore, the future $(X_{t+s})_{s \ge 0}$ is another CTMC with the same rate matrix $Q$, started at $k$.
The **strong Markov property** states that the same independence holds at any [[Filtrations and Stopping Times#Stopping Time and Stopped Processes|stopping time]] $T$: for any $k \in \mathbb{S}$ such that $\mathbb{P}[X_{T}=k] \ne 0$, conditioning on $\{ X_{T}=k \}$ gives: $\begin{align*}
(X_{s})_{s < T} &\perp (X_{T+s})_{s \ge 0},\\[0.4em]
(X_{T+s})_{s \ge 0} &\sim \mathrm{Markov}(Q, \delta_{k}).
\end{align*}$
- Note that it's not necessary to condition on the value of $T$.
## Transition Matrices
For a CTMC, the transition probability from $i$ to $j \in \mathbb{S}$ over a timespan of $s$ is $p_{ij}(s)=\mathbb{P}[X_{t+s}=j \,|\, X_{t}=i],$ which is independent of the choice of $t$ by the Markov property.
> [!idea]
> Moving forward in time is done by applying the matrix $P(t)$ to the initial distribution $\nu$, viewed as a row vector: $\mathbb{P}[X_{t}=j]=(\nu P(t))_{j}=\sum_{i}\nu_{i}p_{ij}(t).$
>This is analogous to the DTMC version $\mathbb{P}[X_{n}=j]=(\nu P^{n})_{j}$.
Arranging this into a matrix $P(t)=(p_{ij}(t))_{i,j \in \mathbb{S}}$ populates the **transition semigroup** of the CTMC, $\{ P(t) \,|\, t \ge 0 \}$. It has the properties:
- $P(0)=I$,
- $P(t+s)=P(t)P(s)$ under matrix multiplication.
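Both properties can be checked numerically. The sketch below computes $P(t)=e^{tQ}$ with a truncated Taylor series (adequate for a small matrix with moderate $t$, not a production matrix exponential) and verifies $P(t+s)=P(t)P(s)$ on the example generator from this note:

```python
def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transition_matrix(Q, t, terms=60):
    """P(t) = e^{tQ} via the truncated Taylor series sum_k (tQ)^k / k!."""
    n = len(Q)
    P = [[float(i == j) for j in range(n)] for i in range(n)]  # k = 0 term: I
    term = [row[:] for row in P]
    for k in range(1, terms):
        term = mat_mul(term, [[t * q / k for q in row] for row in Q])
        P = [[P[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return P

Q = [[-2.0, 1.0, 1.0], [1.0, -1.0, 0.0], [2.0, 1.0, -3.0]]
lhs = transition_matrix(Q, 0.7)                                  # P(t + s)
rhs = mat_mul(transition_matrix(Q, 0.3), transition_matrix(Q, 0.4))
```

As a bonus, each row of $P(t)$ is a probability distribution, which the check below also confirms.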
### Backward and Forward Equations
> [!idea] The CTMC's transition matrix $P(t) = e^{tQ}=(e^{Q})^{t}$ is the continuous-time analogue of the DTMC's $n$-step matrix $P^{n}$.

> [!theorem|*] Backward Equation
> If $(X_{t}) \sim \mathrm{Markov}(Q, \nu)$ is a CTMC, then its transition matrices $P(t)$ satisfy the **backward equation**: $P'(t):= (p_{ij}'(t))_{i,j \in \mathbb{S}}=QP(t).$Furthermore, given initial condition $P(0)=I$, any other solution $\tilde{P}$ to the equation must be element-wise greater than or equal to $P$.
>
> > [!proof]-
> > Prove the equation for the element $p_{ij}(t)$. Condition on the time of the first jump, $T_{1}\sim \mathrm{Exp}(q_{i})$, and its destination $k$ (before $T_{1}$ the chain is still in state $i$, giving the $\delta_{ij}e^{-q_{i}t}$ term): $\begin{align*}
> > p_{ij}(t)&= \mathbb{P}_{i}[X_{t}=j ~~\&~~ T_{1} > t] + \mathbb{P}_{i}[X_{t}=j ~~ \& ~~ T_{1} \le t] \\
> > &= \delta_{ij}e^{-q_{i}t} + \int _{0}^{t}\mathbb{P}_{i}[X_{t}=j \,|\, T_{1}=s] \cdot q_{i}e^{-q_{i}s} \, ds\\[0.8em]
> > & \begin{split} =\delta_{ij}e^{-q_{i}t}+ \int _{0}^{t}\sum_{k \ne i} \big\{ \mathbb{P}_{i}&[X_{t}=j \,|\, X_{T_{1}}=k, T_{1}=s] \cdot \\ &\mathbb{P}_{i}[X_{T_{1}}=k \,|\, T_{1}=s] \big\}\cdot q_{i}e^{-q_{i}s} \, ds
> > \end{split} \\[0.8em]
> > &= \delta_{ij}e^{-q_{i}t} + \int _{0}^{t} \sum_{k \ne i} p_{kj}(t-s) \cdot \pi_{ik} \cdot q_{i}e^{-q_{i}s} \, ds\\
> > &= \delta_{ij}e^{-q_{i}t} + \int _{0}^{t} \sum_{k \ne i} p_{kj}(u) \pi_{ik} q_{i}e^{-q_{i}(t - u)} \, du & [u=t-s]
> > \end{align*}$
> > Multiplying both sides by $\exp(q_{i}t)$ gives $p_{ij}(t)e^{q_{i}t}=\delta_{ij} + \int _{0}^{t} \sum_{k \ne i} p_{kj}(u) \pi_{ik} q_{i}e^{q_{i} u} \, du,$ and differentiating in $t$ gives $\begin{align*}
> > (q_{i}p_{ij} + p_{ij}')\cancel{e^{q_{i}t}}&= \sum_{k \ne i} p_{kj}(t) \underbrace{\pi_{ik} q_{i}}_{=q_{ik}}\cdot\cancel{e^{q_{i}t}}\\
> > p'_{ij}&=\sum_{k}q_{ik}p_{kj}(t),
> > \end{align*}$ where the $-q_{i}p_{ij}=q_{ii}p_{ij}(t)$ term, moved to the right-hand side, is the $k=i$ term of the sum. This is the element-wise form of the backward equation $P'(t)=QP(t)$.
- The backward equation has a unique solution if $\sum_{j \in \mathbb{S}}p_{ij}(t)=1$ for every $i \in \mathbb{S}$ and $t \ge 0$, so non-uniqueness only arises for exploding CTMCs.
> [!theorem|*] Forward Equation
> Similarly, if we assume $(X_{t})$ to be minimal, then it also satisfies the **forward equation** $P'(t)=P(t)Q,$ of which $P$ is the minimal solution under the initial condition $P(0)=I$.
> > [!proof] Proof for finite state space
> > Recall that if $i \ne j$, $P_{ij}(h)=(q_{i}h +o(h))\pi_{ij}=q_{ij}h +o(h)$, while $P_{jj}(h)=1-q_{j}h+o(h)$. Therefore, conditioning on $X_{t}$, $\begin{align*}
P_{ij}(t+h)&= P_{ij}(t)P_{jj}(h)+\sum_{k \ne j}P_{ik}(t) P_{kj}(h)\\
&= P_{ij}(t)(1-q_{j}h +o(h) ) + \sum_{k \ne j} P_{ik}(t)(q_{kj}h + o(h))\\
&= P_{ij}(t) +h\left( \sum_{k \in \mathbb{S}} P_{ik}(t)q_{kj} \right)+o(h)\\
P_{ij}(t+h)-P_{ij}(t)&= h(P(t)Q)_{ij}+o(h),
\end{align*}$and dividing by $h$ then letting $h \to 0$ gives the forward equation.
> >
> > In the third equality, collecting $\sum_{k} P_{ik}(t)\cdot o(h)$ into a single $o(h)$ is only justified in finite state spaces.
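Both equations can be verified numerically on the example generator: compare a central finite difference of $P(t)$ against $QP(t)$ and $P(t)Q$. The Taylor-series matrix exponential below is an illustration-only sketch, and the step size `h` is a made-up choice.

```python
def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transition_matrix(Q, t, terms=60):
    """P(t) = e^{tQ} via a truncated Taylor series (small matrices only)."""
    n = len(Q)
    P = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in P]
    for k in range(1, terms):
        term = mat_mul(term, [[t * q / k for q in row] for row in Q])
        P = [[P[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return P

Q = [[-2.0, 1.0, 1.0], [1.0, -1.0, 0.0], [2.0, 1.0, -3.0]]
t, h = 0.5, 1e-5
# Central difference approximation of P'(t), entry by entry.
P_deriv = [[(a - b) / (2 * h) for a, b in zip(ra, rb)]
           for ra, rb in zip(transition_matrix(Q, t + h),
                             transition_matrix(Q, t - h))]
backward = mat_mul(Q, transition_matrix(Q, t))   # QP(t)
forward = mat_mul(transition_matrix(Q, t), Q)    # P(t)Q
```

That both comparisons succeed reflects the fact that $Q$ commutes with $e^{tQ}$, so $QP(t)=P(t)Q$ here.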
These equations motivate the identity $P(t)=e^{tQ}=(e^{Q})^{t},$ which fills in the gaps between the discrete skeleton $(I, P, P^{2},\dots)=(P(0), P(1),P(2),\dots)$.
A few properties follow:
- $P^{(k)}(t)=Q^{k}P(t)$, so in particular $P^{(k)}(0)=Q^{k}$; the derivatives are again element-wise.